May 11, 2017
The Democratization of Dingers
Many baseball fans know that in 1927, Babe Ruth hit 60 home runs. Lou Gehrig hit 47. The Cubs’ Hack Wilson hit 30, as did the Phillies’ Cy Williams. The Giants’ Rogers Hornsby hit 26, and his teammate Bill Terry had 20. That’s it—nobody else hit 20 or more round-trippers.
In 1927, there were 16 major-league teams. Let’s call the top 16 home run hitters in the majors (as many players as there were teams) the "elite" sluggers that year: Ruth, Gehrig, Wilson, Williams, Hornsby, Terry, and 10 guys who hit 14-19 bombs. Those 16 players hit 372 home runs that year. Across the majors, there were 922 home runs. So the elite 16 players accounted for just over 40 percent of all home runs that year. Since 1920, the year Ruth hit 54 home runs, 1927 is the only season in which the elite home run hitters (defined as the top n in the majors, where n equals the number of teams) hit over 40 percent of all homers.
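For anyone who wants to reproduce the "elite share" figure, here is a minimal Python sketch. The function name elite_share and the toy list of home run totals are my own illustrations, not part of the original analysis; only the 372 and 922 totals come from the numbers above.

```python
def elite_share(player_hr, n_teams):
    """Fraction of all home runs hit by the top n_teams hitters."""
    top = sorted(player_hr, reverse=True)[:n_teams]
    return sum(top) / sum(player_hr)

# Hypothetical four-player, two-team league: the top two hit 15 of 20 homers.
toy_share = elite_share([10, 5, 3, 2], n_teams=2)

# 1927 totals quoted above: 372 elite home runs out of 922 overall.
share_1927 = 372 / 922
print(f"{share_1927:.1%}")  # 40.3%
```

Running the same function over each season’s full list of player home run totals, with n_teams set to that season’s team count, would give one share per year.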
Fifteen years earlier, in 1912—two years before Ruth’s first appearance in the majors, not that that’s in any way relevant—the Italian statistician Corrado Gini created a measure, now known as the Gini Coefficient. The Gini Coefficient measures inequality among values in a distribution. It ranges from 0 (perfect equality) to 1 (perfect inequality). Gini proposed it as a measure of income or wealth inequality, and that’s its most common usage, though it can be used to measure the inequality within any dataset.
Here, for example, is the Gini Coefficient for disposable income inequality among the 35 (mostly wealthy) countries in the Organization for Economic Cooperation and Development, or OECD:
The most egalitarian countries in the OECD are Iceland, Norway, and Denmark. The most unequal are Chile, Mexico, and the United States. Further, several countries, including the U.S., have become more unequal since the Great Recession; those countries are represented by the blue bars with orange lines within them rather than over them. By and large, the Gini Coefficient is consistent with public perception. Scandinavian countries are notably egalitarian; the U.S. is pretty stratified. (Though what’s up with Chile?)
This isn’t an attempt to swerve from a discussion of home runs to one of income inequality. Rather, it’s an attempt to use Gini’s model to look at home run distributions. Baseball in the 1920s was very unequal when it came to home run production. A few players generated a large proportion of homers, and many hit very few, if any.
Contrast that with last season. There were 30 teams, so let’s call the 30 players who hit the most home runs "elite." Mark Trumbo, Nelson Cruz, Khris Davis, Brian Dozier, Edwin Encarnacion, Nolan Arenado, Chris Carter, and Todd Frazier all hit 40 or more. The 30th-highest individual total last season was 31, a mark that Mookie Betts, Yoenis Cespedes, Albert Pujols, Yasmany Tomas, and Justin Upton all reached. The top 30 home run hitters, combined, went deep 1,096 times. That’s 49 percent more than the elite home run hitters in 1927, adjusting for the difference in league size and games played. Nonetheless, the elite 30 accounted for only 19.5 percent of the 5,610 home runs hit last year. That’s the lowest proportion of all time.
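The adjustment behind that 49 percent figure isn’t spelled out, so here is one way the arithmetic could work, sketched in Python. It assumes the comparison is home runs per elite slugger, with the 1927 figure scaled up from a 154-game schedule to a 162-game one; that method is my assumption, though it does reproduce the numbers quoted above.

```python
# Totals quoted in the article.
elite_hr_1927, teams_1927 = 372, 16
elite_hr_2016, teams_2016 = 1096, 30
league_hr_2016 = 5610

# Assumption: schedules of 154 games (1927) and 162 games (2016).
games_1927, games_2016 = 154, 162

# Home runs per elite slugger, with 1927 scaled to a 162-game season.
per_slugger_1927 = elite_hr_1927 / teams_1927 * (games_2016 / games_1927)
per_slugger_2016 = elite_hr_2016 / teams_2016

increase = per_slugger_2016 / per_slugger_1927 - 1
share_2016 = elite_hr_2016 / league_hr_2016

print(f"{increase:.0%}")    # 49%
print(f"{share_2016:.1%}")  # 19.5%
```

So the elite sluggers hit more homers apiece than their 1927 counterparts, yet their share of the league total was less than half as large.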
This trend shouldn’t be a surprise. Home runs were relatively rare back in the 1920s. In 1927, Ruth out-homered every team but the Cubs, Giants, and Cardinals (and the Yankees, of course). Even Wilson and Williams, with 30 each, out-homered four teams. By contrast, there hasn’t been a team with fewer than 60 homers since the 1986 Cardinals, and the last time a team failed to hit 30 was during World War II.
Let me show this graphically. This graph shows the percentage of all major-league home runs hit by the elite sluggers, as defined above, for every season from 1920 to 2016:
As you can see, the top 16 home run hitters routinely accounted for a third of all homers through the 1930s. The proportion fell steadily from the start of World War II through the late 1950s, when it plateaued around 24 percent. It bounced around there for several years, then began another decline in the 1980s. The three years with the smallest share of home runs hit by the top sluggers are 2016, 2013, and 2012.
Let’s see whether the Gini Coefficient confirms this. I took every batter with 50 or more plate appearances in a season since 1920 and, for each year, calculated a Gini Coefficient based on the number of home runs each batter hit. (Note: You can really test the limits of Excel by having it calculate array formulae on a table with over 40,000 rows.)
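If Excel isn’t your thing, the same calculation can be sketched in a few lines of Python. This is not the spreadsheet I used; it’s an illustration that uses a standard sorted-rank formula for the Gini Coefficient. The function names, the (year, plate appearances, home runs) row layout, and the sample rows are all hypothetical.

```python
from collections import defaultdict

def gini(values):
    """Gini coefficient of non-negative values (0 means perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Sorted-rank identity: G = 2 * sum(i * x_i) / (n * total) - (n + 1) / n,
    # where i is the 1-based rank of x_i in ascending order.
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n

def gini_by_season(rows, min_pa=50):
    """Map each season to the Gini coefficient of its batters' home runs.

    rows holds (year, plate_appearances, home_runs) tuples; batters with
    fewer than min_pa plate appearances are dropped.
    """
    by_year = defaultdict(list)
    for year, pa, hr in rows:
        if pa >= min_pa:
            by_year[year].append(hr)
    return {year: gini(hrs) for year, hrs in by_year.items()}

# Made-up rows for illustration only, not real player lines.
sample_rows = [
    (1927, 600, 60), (1927, 550, 5), (1927, 30, 0),
    (2016, 600, 30), (2016, 580, 30),
]
season_gini = gini_by_season(sample_rows)
```

In the made-up data, the two qualifying 2016 batters hit the same number of homers, so that season’s coefficient is 0, while the lopsided 1927 pair produces a higher value. With a finite group of n batters, the largest possible value is (n - 1) / n rather than exactly 1.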
This confirms the results in the earlier graph showing the percentage of home runs hit by elite sluggers. The distribution of home runs is a lot more equal now than it was 90 years ago. The five most equal non-strike seasons, per the Gini Coefficient, are, in order, 2015, 2013, 2016, 2006, and 2005. The five most unequal are, in order, 1927, 1920, 1931, 1929, and 1926.
And that’s evident by looking at the distribution of home runs. Last year there were a record number of homers. And, as noted, eight players hit 40 or more. But that’s nowhere near the record. There were 17 players with 40 or more homers in 1996, 16 in 2000, 13 in 1998 and 1999, 12 in 1997 and 2001 ... you get the idea. Eight players with 40 or more home runs is tied for only 12th-most all time.
But last year there were 30 batters with 30-39 home runs. Only 1999, with 32, and 2000, with 31, had more. And players with 20-29 homers? There were 73 of them in 2016, crushing the old record of 64 set in 2008.
See the pattern? We got a record number of home runs last year not from a few players hitting a ton of them, but from many, many players getting 20 or more. There were 111 players with 20 or more home runs in 2016. That’s what made the record: a lot of players getting a decent number of home runs, not a Ruth and a Gehrig vastly outperforming their peers.
There’s a sense that early baseball was dominated by a handful of incredible athletes, while the contemporary game is brimming with them. From the perspective of home runs, at least, this seems to be the case. Home runs aren’t as equally distributed as disposable income—Chile’s 0.47 Gini Coefficient for income is lower than 2016’s 0.54 Gini Coefficient for homers—but I think we can also agree that home run distribution doesn’t rise to the level of a societal problem, either. Whatever the underlying cause, the rising tide of home runs seems to be lifting many, many boats.