Okay, this post is going to be a long one, but by the end of it, you will havecreated yourself a Retrosheet database for MySQL or MariaDB. There is a postsimilar to this on Beyond the Boxscorecalled Saberizing a Mac #9: Retrosheet (part1);however, it is severely lacking details on how to get the database actuallyset up your Mac.
- Download Retrosheet Play By Play File In Mac Free
- Download Retrosheet Play By Play File In Mac Download
- Download Retrosheet Play By Play File In Mac Os
In order to parse the .EVN and .EVA event files from Retrosheet, we need toinstall the Chadwick tools.Before we can do that we need to install Homebrew which is a package managerfor OSX.
Download XCode for Mac
Download Retrosheet Play By Play File In Mac Free
![Play Play](/uploads/1/3/3/2/133275674/644264723.jpg)
You can install XCode directly from the Mac App Store. We need thisto install the command-line tools.
![Download Download](/uploads/1/3/3/2/133275674/145733429.png)
Get the Command-Line Tools
Open up your terminal which is located in
/Applications/Utilities/
, and inthe terminal run the following command.
Install Homebrew
Now that we have the command-line tools, we can finally download and installHomebrew. Keep your terminal app open and run:
Once, it is installed run
brew doctor
and if you get the messageYour system is ready to brew
you are golden. If you do not, thispostis an excellent troubleshooting resource.
Retrosheet Full Play by Play? I’m trying to get the full play by play for retrosheet for 2019, but ideally any year. When I download it, it gives me it separates it by team and even then it’s incomplete. If anyone has any fixes or advice I would be thankful. BEVENT.EXE: Creates play-by-play data file suitable for import into database programs, spreadsheets, and custom written programs.Enter the DOS command 'bevent -h' for more information. version update: BOX.EXE: Generates traditional box scores from Retrosheet data files.Enter the DOS command 'box -h' for more information. These tools don’t include any data, nor do they fetch it from the Internet. Before using them, you’ll need to get some baseball data. See “Get Historical Play-by-Play Data” Hack #14 for more information. A company called DiamondWare developed a set of three tools for Microsoft Windows for processing Retrosheet–style event files. In particular, the event (a.k.a. Play-by-play) files can be especially difficult to parse. This package does the parsing on those files, returning the requested data in the most practical R structure to use for sabermetric or other analyses.
If you already have MySQL or MariaDB installed on your computer, you can skipthis section and move onto installing Chadwick.Once we have homebrew installed we can build MariaDB which is a fork of MySQLthat has better performance. Reasons for choosing MariaDB over MySQL can befound here.To install we run:
Once that is completed run:
You can make it so every time you log in to your computer the MariaDB/MySQL serverstarts up, but I prefer to start it manually any time I need it. To start itall you have to do is run in your terminal.
Once you start up the server, you logging in is very easy.
A couple weeks ago I was finally able to get Chadwick added to Homebrew. This makesinstalling the it so much easier on a Mac.
Chadwick is an open-source program that allows one to parse the event filesfrom Retrosheet. Suppose I have the event files from the 2014 season, which canbe downloaded directly from Retrosheet.I then unzip the files and navigate into the directory in which it was unzipped.
I am interested in the Braves home opener from 2014, which was against theNew York Mets. Using the chadwick tool
cwbox
, I am able to print out the boxscorefrom that game to my terminal. The steps below illustrate how this is accomplished.
Okay, we are finally ready to begin building the retrosheet database. First download this zip file which contains the sql fileswe will need and empty folders named unparsed and parsed. Unzip this fileinto your home directory. The unparsed folderis what we will download the Retrosheet event files into, and the parsed folderwill contain the csv’s after we parse the event files. Here is what the retrosheetfolder tree will look like.
Once again I recommend moving the unzipped folder after you download it into your home directory because the loader sql files tell the database server to look forthe data in ~/Retrosheet/parsed/. If you want this folder somewhere else you’llhave to edit the sql loader files.
Downloading the Data
We can actually download the data directly from Retrosheet using the terminal andwget. This really makes it easier than downloading them indiviudally. Follow the steps that are shown below to download the data zip files by decade, beginning withthe 1950’s. Once we download them, we will unzip the files as well.
Download Retrosheet Play By Play File In Mac Download
If you open up the unparsed folder in your finder you will find a folder full of .EVA, .EVN, and .ROS files. These are the Retrosheet files that Chadwick will parse.
Parsing the Data
Download Retrosheet Play By Play File In Mac Os
Since we now have all the data we need, we can finally start parsing the data.Using three for loops in your terminal will allow us to parse all the data from 1950to 2014. One loop parses the events, the games, and the subs into csv files thatcan be loaded into the database.
Creating and Populating the Retrosheet Database
Using the sql files that are included in the zip file, we can create and populatethe Retrosheet database. The first script, 01_schema.sql defines the Retrosheetdatabase schema. It creates an events_bck, events, games_bck, games, and subs tables.The events_bck and games_bck tables are what we will load the data into using 02_load_events.sql and 03_load_games.sql. We load the subs into the subs table using 04_load_subs.sql. To insert the data from events_bck and games_bck into the eventsand games tables we use the 05_partition.sql. In this file we create the “YEAR_ID”columns which the tables are patitioned by and copy the rows from the two loadertables into our final tables. Lastly, our lookup tables are stored in 06_lookup_codes.sql. Thisalso contains the players table which holds all the player ID’s from each player in the Retrosheetdata.
The structure for these tables is suggested by Colin Wyers from the Hardball Times,and all I did was make modifications that would make the files compatiable on a Mac.His orginial post is entitled Building a Retrosheet Database, the Short Form.
In the code below, you can see that our first step is to start the mysql server.The next step creates the Retrosheet schema, and the others populate the database.Notice the –local-infile=1; we have to add this because this featurehas been disabled by default because of security issues.Using that flag though allows it temporarily, and rest assured there is nothingin those sql scripts that should cause you any worry.
We do not have to specify a database because each of the loader sql scriptsinclude the following in the first line.
As a side note, if you created a password for your server (which is not covered herebecause this database is meant to remain local), you will have to use the -pflag.
At this point you should have a Retrosheet database on your Mac. If these steps did not work,you found any errors, want more elaboration, or have any other general questions please emailme at [email protected]. In future postsI will demonstrate some of the SQL queries you can perform using the Retrosheet data.
Posted with : MLB , Retrosheet , SQL