The late literary scholar Michael André Bernstein offers a compelling argument against one way we read history in a literary fashion. He argues that “foreshadowing” in historical narratives leads to interpretations of the past as if the once-future present were predetermined. Instead, Bernstein suggests that historical narrative should deploy “sideshadowing,” which he defines as “a gesturing to the side, to a present dense with multiple . . . possibilities for what is to come” (page one in the book linked above). By doing so, we can acknowledge that there is nothing natural about a present condition; rather, that condition was the product of individual choices, the alternatives to which illuminate both the past and the present. We can apply Bernstein’s principle of sideshadowing to the historical development of baseball statistics.
This post is inspired by Sabermetrics 101, a free online course from Boston University and edX. The course is an introduction to baseball sabermetrics, and it offers four learning tracks: sabermetrics, statistics, technology/databases, and history. I’m most attracted to the history track because history is my day job, but there’s also much to learn from the other tracks, all of which intersect. In particular, the technology and history lessons from the first course module led me to create an alternative batting statistic that very well could have been used instead of batting average. I then created a list of the best batting seasons in Rockies history according to my alternative statistic.
The metric is flawed but plausible, and I based it on what a mid-nineteenth-century recorder and analyst of baseball statistics valued. Andy Andres, who has been teaching the course at BU since 2004, explains that one of the first people to record baseball statistics was Henry Chadwick. Using one of Chadwick’s rudimentary scorecards from the nineteenth century, Andres shows that Chadwick placed value on two individual outcomes: runs scored and outs made. Andres proposes that students create queries against the Lahman database of baseball statistics to generate metrics Chadwick might have valued for evaluating individual players, such as runs scored per game.
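To make the idea concrete, here is a minimal sketch of the sort of query Andres suggests, run with Python’s sqlite3 against a toy table patterned on the Lahman Batting schema (the column names playerID, yearID, G, and R come from the real Lahman Batting table, but the player IDs and numbers below are made up for illustration, not actual statistics):

```python
import sqlite3

# Toy Batting table patterned on the Lahman schema.
# The rows are illustrative values only, not real statistics.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Batting (playerID TEXT, yearID INTEGER, G INTEGER, R INTEGER)"
)
conn.executemany(
    "INSERT INTO Batting VALUES (?, ?, ?, ?)",
    [("playerA", 1999, 150, 120), ("playerB", 1999, 140, 70)],
)

# Runs scored per game -- the sort of metric Chadwick might have valued.
rows = conn.execute(
    "SELECT playerID, yearID, ROUND(1.0 * R / G, 2) AS r_per_g "
    "FROM Batting ORDER BY r_per_g DESC"
).fetchall()
for row in rows:
    print(row)
```

Pointed at the full Lahman database instead of this toy table, the same SELECT would rank every player season by runs scored per game.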
I took his suggestion and added another layer to it. Because Chadwick seemed to balance the positive outcome of runs scored per game against the negative outcome of outs made per game, the metric I created accounts for both. The statistic is the differential between outs made per game that resulted from an at bat and runs scored per game, with the assumption that a smaller number is better because it demonstrates a small gap between the positive and negative outcomes. So a player season with the most runs scored per game and the fewest outs made per game should rank pretty high. This is the formula I used: ((AB-H)/G)-(R/G). I set the minimum number of at bats at 350, and the query yielded 137 player seasons. Finally—and if I may channel Chris Chrisman for a moment—a dumb statistic needs a dumb name. The statistic measures offensive efficiency by placing value on runs scored and penalizing outs made, so I call it Efficiency of Offense, or effo for short. And so, without further delay, here are the top 20 Rockies seasons, as measured by effo:
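As a quick sanity check of the formula, here is a small sketch (the function names are mine, not from the course) that computes effo from a season line, plus a per-game version checked against Walker’s 1999 components discussed later in the post:

```python
def effo(ab: int, h: int, g: int, r: int) -> float:
    """Efficiency of Offense: ((AB - H) / G) - (R / G). Lower is better."""
    return (ab - h) / g - r / g


def effo_from_rates(outs_per_game: float, runs_per_game: float) -> float:
    """The same statistic computed from its two per-game components."""
    return outs_per_game - runs_per_game


# Walker's 1999 components as given in the post: 2.14 outs/game, 0.85 runs/game.
print(round(effo_from_rates(2.14, 0.85), 2))  # -> 1.29

# A made-up season line, just to show the full formula at work:
# 500 AB, 150 H, 100 G, 50 R -> (350/100) - (50/100) = 3.5 - 0.5 = 3.0
print(effo(ab=500, h=150, g=100, r=50))  # -> 3.0
```

Applying the 350 at-bat minimum would just be a matter of filtering season lines on `ab >= 350` before computing the statistic.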
And here are the five worst, ranked with the most futile at the top.
If I told you that 11 of the 20 best run-producing seasons in Rockies history belonged to Larry Walker and Todd Helton between 1997 and 2005, according to a metric that is neither park nor era adjusted, you would not be surprised. That is precisely what effo shows. You also shouldn’t be surprised to discover that nine of the 20 seasons took place between 1993 and 2001, the pre-humidor days of Coors Field; the remaining 11 occurred afterward. You might, however, be surprised to find names such as Corey Sullivan, Tyler Colvin, and Ryan Spilborghs on the list. Those seeming anomalies, based on our perception of how good those players were compared to absent names such as Matt Holliday and Troy Tulowitzki, can be explained by digging into the component parts of effo. For example, Larry Walker’s 1999 effo was the product of making 2.14 outs per game while scoring 0.85 runs per game. Corey Sullivan in 2005, by contrast, produced just 0.46 runs per game, but he also made only 1.92 outs per game. Sullivan’s great effo season can be explained by a limited number of at bats, 378. That is, he didn’t have as many opportunities to make outs. The bottom of the list is also unsurprising and includes light hitters Neifi Perez, Marco Scutaro as a member of the Rockies, and Walt Weiss. Just as a lack of at bats might land a player at the top of the list, an abundance of at bats can drop him to the bottom. Neifi Perez’s 1999 season wasn’t great, but the primary reason he brings up the rear is that he led the league in at bats that year with 690. He had more opportunities than anyone else to make an out, which inflated his effo.
For the sake of comparison, here are the top 20 seasons in Rockies history according to batting average, as well as the five worst. Again, I set the minimum number of at bats to 350.
Take a look at the top and bottom of the batting average ranking. Larry Walker’s .379 batting average in his 1997 MVP season remains the best ever for the Rockies. In that season, Walker also produced the best effo in Rockies history at 1.29. Now let’s turn our attention to the bottom. If we discount Perez’s 1999, when he led the league in at bats, the worst Rockies season according to effo was Clint Barmes’s 2006, when he ended with a mark of 2.41. Notably, that was also the worst qualifying season in Rockies history according to batting average: he finished at .220. Not only that, but 12 of the 20 best batting average seasons are also among the 20 best effo seasons in Rockies history.
Of course, effo isn’t perfect. It requires a look at its component parts, such as runs scored per game, to get a fuller understanding of offensive production. But the same is true of batting average: our understanding of it is not only enhanced by batting average on balls in play but, I would argue, depends on it. Effo also masks team production as individual skill, since scoring a run requires a player to get on base and a teammate to get a hit behind the runner. But then again, runs batted in, a canonical statistic, generally requires a runner on base for a player to accrue RBIs.
Batting average has been so thoroughly ingrained into our baseball consciousness that we forget it is a creation of human observers. Not only that, but its establishment and perpetuation by baseball organizations, fans, and the media has come at the expense of other metrics. A sideshadowed telling of the development of baseball statistics could reasonably have the entities that managed the game’s development favoring effo over an obscure and unilluminating statistic like batting average, because the former tells us something about scoring runs while the latter does not. I’m not saying that we should start using effo, even if it’s fun to say. I am, however, suggesting that we recognize that no statistic is any more natural than another. Batting average, effo, and even something like weighted on-base average are each discoveries based on the material provided by the game. To that extent, each of these metrics has always existed as a collection of parts. They’ve just awaited someone to put them together and advocate for their advancement.