

Review of Big Data Baseball


Travis Sawchik has written one of the best baseball books of the past decade. His overview of the Pirates' 2013 season shows how the team effectively used "big data," along with good coaching and scouting, to end a 20-year losing streak.

Robert Mayer-USA TODAY Sports

The book Moneyball (2003) explained how the Oakland A's front office helped revolutionize baseball decision-making by using statistical analysis to acquire players whose skills were undervalued.  With the release of Big Data Baseball: Math, Miracles and the End of a 20-Year Losing Streak, the statistical tools described in Moneyball seem quaint by comparison.  From 2003 through 2013, there was an exponential leap in the data available to major league clubs.  Pittsburgh Tribune-Review reporter Travis Sawchik examines how the Pittsburgh Pirates made use of new data to break their string of 20 consecutive losing seasons by advancing to the playoffs in 2013.  The book is a compelling and engaging narrative of the ways that the Pirates integrated the latest advances in sabermetrics with more traditional scouting approaches.  It is also a primer on how the sabermetric revolution continues to change how baseball is played, especially given the explosive growth of big data now available to major league teams as a result of research and technological innovation.  When Bill James published his Baseball Abstracts in the early 1980s, there were 200,000 total data points per season.  This expanded to about 1 million in 1990 with Project Scoresheet and to 20 million with PITCHf/x by 2008.  With the introduction of new player-tracking technology in 2014 and 2015, the numbers have soared to approximately 2.4 billion data points (p. 61).
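To put those jumps in perspective, here is a quick back-of-the-envelope sketch of the growth between eras.  The four counts are the ones cited from the book (p. 61); the era labels and the `growth_factors` helper are my own shorthand for illustration, not anything from Sawchik.

```python
# Data points per season as cited from Big Data Baseball (p. 61).
# The era labels are informal shorthand, not the book's wording.
data_points = [
    ("Baseball Abstracts, early 1980s", 200_000),
    ("Project Scoresheet, 1990", 1_000_000),
    ("PITCHf/x, 2008", 20_000_000),
    ("Player tracking, 2014-2015", 2_400_000_000),
]


def growth_factors(series):
    """Return (label, multiple of the previous era) for every era after the first."""
    return [
        (label, count / prev_count)
        for (_, prev_count), (label, count) in zip(series, series[1:])
    ]


for label, factor in growth_factors(data_points):
    print(f"{label}: {factor:,.0f}x the previous era")

# Overall multiple from the first era to the last.
overall = data_points[-1][1] / data_points[0][1]
print(f"Overall: {overall:,.0f}x")
```

Each step is a bigger multiple than the one before it, which is the sense in which the growth deserves to be called exponential.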

Sawchik understands that this vast increase in data collection requires human intelligence to interpret what all this data means.  The brilliance of the book is the way the author weaves the human interest stories with a history of how baseball statistics have evolved.  The narrative opens with Pirates General Manager Neal Huntington going to the home of manager Clint Hurdle in the offseason of 2012 after another second-half collapse pushed the Pirates below .500 for the 20th consecutive season (2011 had been much the same).  Huntington needed to talk with Hurdle about strategies to avoid another disappointing season.  The Pirates would not be able to rely on big spending, as the ownership would allocate only $15 million to sign free agents.  And the farm system, while very good and expected to produce quality big league talent, was still more than a year away from helping the big league club, as nine of the top ten prospects were not expected to reach the majors in 2013.  Huntington felt that the only way the team could maximize its chances was to exploit new statistical data on run prevention.

Previous research undertaken by Bill James (Project Scoresheet) and expanded considerably by John Dewan (Baseball Info Solutions) enabled the Pirates front office to examine detailed batted ball and pitch-by-pitch data as it pertained to the effectiveness of infielders in preventing runs.  Starting in 2002, Dewan began collecting the most rigorous data on fly balls, line drives and ground balls ever assembled.  His research revealed that "major league hitters hit ground balls to their pull side 73 percent of the time ... line drives to their pull side 55 percent of the time ... and fly balls to their pull side only 40 percent of the time" (p. 37).  The Pirates' fledgling analytics division, initially staffed with just one person, Dan Fox, and expanded later to include Mike Fitzgerald and others, used its own statistical analysis to build upon the implications of Dewan's research for the Pirates organization.  The conclusion was that the Pirates should be radically shifting their infield defense on most major league hitters to save runs.  Starting in 2012, the Tampa Bay Rays and the Milwaukee Brewers employed the shift most aggressively, with an estimated 8 or 9 win improvement for Tampa Bay and a 5 or 6 win improvement for Milwaukee.  The Pirates started introducing some of these defensive shifts in their minor league system as early as 2008, not initially due to sabermetric research but instead to the scouting instincts of Perry Hill, currently the Marlins' first base coach, who was then the Pirates' minor league infield coordinator.  Hill felt that infielders were getting beaten consistently on balls hit to hitters' pull side, and he had already engaged in discussions with the Pirates' front office, including the newly hired sabermetrician Dan Fox, about his thoughts.
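Dewan's pull-side percentages are easier to appreciate as counts.  The sketch below takes the three rates quoted from the book (p. 37) and splits a hypothetical 100 batted balls of each type; the sample size of 100 and the `pull_side_split` helper are assumptions of mine for illustration, not part of Dewan's or the Pirates' actual analysis.

```python
# Pull-side percentages quoted from Big Data Baseball (p. 37).
PULL_PCT = {"ground ball": 73, "line drive": 55, "fly ball": 40}


def pull_side_split(batted_balls, pull_pct):
    """Split a count of batted balls into (pulled, not pulled) using a whole-number percentage."""
    pulled = batted_balls * pull_pct // 100
    return pulled, batted_balls - pulled


# Hypothetical sample of 100 batted balls per type, chosen only for readability.
for ball_type, pct in PULL_PCT.items():
    pulled, other = pull_side_split(100, pct)
    print(f"{ball_type}: {pulled} pulled, {other} to the rest of the field")
```

Ground balls are the only batted-ball type with a lopsided split, and they are also the ones infielders field, which is why the argument for shifting centers on the infield.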

Hill's practices reinforced the research that Fox was undertaking about the statistical benefits of defensive shifting.  This contributed to the Pirates experimenting with more radical defensive shifts in their minor league organizations based on both the insights of infield instructor Hill and the data provided by Dan Fox.  When Huntington met with manager Clint Hurdle, he would ask the manager to oversee the implementation of substantial defensive shifts at the major league level for the first time in 2013.  Given the history of using these shifts successfully in the minors, Huntington was able to persuade Hurdle of the merits of trying them at the big league level.  Both Huntington and Hurdle understood the importance of the Pirates coaching staff and players "buying into" such a radical change in big league strategy.  The fact that Huntington suggested that there should be less separation between the Pirates' sabermetric analysts and the on-field personnel led to the decision to make analysts Dan Fox and Mike Fitzgerald available in the Pirates clubhouse during home games.  Interspersed throughout the book we see Fox and Fitzgerald interact with both the Pirates coaches and players so that there would be a "give and take" between the analytics and its actual implementation on the field.  Defensive positioning strategies were modified to take advantage of insights from coaches and players, and adjusted based on the particular strengths of Pirates' pitchers.

In fact, the Pirates also combined sabermetrics with traditional scouting approaches in targeting free agent pitchers and in adjusting the approaches of Pirates pitchers to maximize the effectiveness of the defensive shifts.  Thanks to the innovations associated with PITCHf/x tracking data, developed in 2007 by the Chicago-based company Sportsvision and installed in every major league stadium by 2008, there was a goldmine of data that had not previously existed.  Sawchik describes PITCHf/x as a technological revolution that works as follows: "The cameras and object-recognition software capture images of a pitch's flight from the time it leaves a pitcher's hand until it crosses home plate. From the images the speed, trajectory, and three-dimensional location of the ball are tracked in real time.... For the first time, the exact speed of a pitcher's throws and the exact percentage of times he threw certain pitches could be tracked" (p. 60).  The Pirates made successful use of this technology in 2012, when pitching coach Ray Searage and special assistant Jim Benedict used the data to recommend that newly acquired free agent A.J. Burnett rely on his sinker more than he ever had in his career, thereby generating more ground balls and reducing his home run rate.  The Pirates coaching staff also encouraged Charlie Morton to get away from the "high arm slot approach" that had been taught to him in the past, in favor of a lowered arm slot that would allow him to throw a sinker more effectively, which would also generate more ground balls.  The shift to ground ball pitchers became an important focal point of the Pirates' offseason of 2012, when the hard-throwing Francisco Liriano was signed as a free agent because his combination of velocity and movement convinced the Pirates staff that he was capable of being highly effective.  Once Liriano signed, Searage worked with him on using his two-seam fastball much more often to generate ground balls, which complemented the radical defensive shifts the team began to use in 2013.

The PITCHf/x technology also was used, thanks to the pioneering work of Dan Turkenkopf, to help the Pirates' Dan Fox understand the "hidden value" of a catcher's pitch framing ability in targeting free agents for the 2013 season.  Fox looked at how effective catcher Russell Martin had been throughout his career in converting borderline balls into strike calls for his pitcher, which PITCHf/x data allowed researchers to track.  Martin clearly had a skill set that was being underappreciated in the baseball marketplace, and Fox insisted that the Pirates make him their primary free agent target as a result.  Martin's pitch-framing prowess had saved the Yankees 23 runs in 2012, value that many teams missed because they focused on his .211 batting average.  Throughout his career, Martin was better than his superficial numbers suggested, and over an extended period of time he had been at the top (or near the top) among catchers in saving runs for his teams.

The Pirates of 2013 succeeded in maximizing their resources and their talent just as Huntington hoped would be the case when he had that first offseason meeting with manager Clint Hurdle in 2012.  According to some measures, the team added about 7 or 8 wins with their defensive shifts in 2013, which contributed to their first winning season in 20 years.  The fact that the team made the playoffs and won the wild card game against the Reds made baseball exciting again in Pittsburgh, where the Pirates' history of futility and the supremacy of football stacked the deck against a revitalized baseball fan base.  The Pirates not only managed to triumph over such adversity, but they embodied how the sport is coming of age as the latest sabermetric revolution takes hold.  This revolution has ushered in a gigantic leap in data collection, a race between organizations to see who can best manage and interpret the data in ways that help their clubs win, and a battle over how to find the next statistical edge before opponents do.  Teams constantly react to short-term trends in an attempt to counter them:  thus the Oakland A's looked to get more fly ball hitters in 2014 in response to teams such as the Pirates emphasizing defensive infield shifts and two-seam fastballs.

Where do our Miami Marlins stack up in this analytic competition?  Unfortunately, the Marlins were so deficient in their use of analytics in 2013 that the team employed the defensive shift less than any other club in the sport.  Big Data Baseball (pp. 107-108) lists the MLB teams in rank order of their use of defensive shifts in 2013 and 2014, which indicates that just about every club increased its use of defensive shifts.  The Marlins were the only one of the 30 teams not even to appear on the list, which is both an error in the book (they should have been included even if they were last!) and a metaphor for how far behind the Fish have been in the analytic revolution.  I tracked down a Wall Street Journal list from 2013 that does in fact include the Marlins, but shows them to be a distant 30th in defensive shifts employed.  In 2014, the Marlins did escape last place in the defensive shift category, ranking ahead of organizations like Detroit and Washington, but were still among the bottom teams in the use of the defensive shift.

Over the last two years, the Fish have started to make some changes to their front office.  An analytics division, however small, was first created in 2014 and is being expanded in the offseason of 2015.  The Marlins' recent hires from the Pirates organization, including Jim Benedict, who was a special assistant to Pirates GM Neal Huntington, bode well for a more forward-thinking approach.  Benedict will serve as the Marlins' Vice President of Pitching Development.  The team's President of Baseball Operations (and now General Manager), Michael Hill, has said that bolstering the analytics division is the top priority of the offseason after the recent hiring of Don Mattingly as manager.  Let's hope that it's not a flash in the pan, because the biggest revolution in analytics has happened this past year with the introduction of Statcast.  This technology promises to revolutionize how clubs measure defense, speed, pitching and hitting by monitoring such things as spin, launch angles, players' defensive reaction times and routes to balls, and baserunners' lead sizes and first-step quickness.  But to take advantage of this, organizations need to be on the cutting edge of hiring the best analytics people and, more importantly, just as the Pirates did, of integrating their insights into managing, coaching, scouting and player development.  The Marlins under this ownership may have finally started to figure this out, but is it too little, too late?