May 3, 2012
Overthinking It: Spoiling the Bunch
If you’ve paid any attention to the 2012 season, you know that Albert Pujols has yet to hit a home run. The three-time MVP, fresh off the first homerless month of his career, is hitting just .208/.252/.287 with career-worst walk and strikeout rates. Jered Weaver’s no-hitter last night temporarily deflected some attention away from Albert’s struggles. But while Weaver mowed down Minnesota, Pujols’ homerless streak was extended to 107 plate appearances, ensuring that scrutiny of his every swing will only intensify once the no-hitter hubbub dies down.

Pujols averaged 39 home runs for the Cardinals over the past five seasons. After factoring in some age-related decline and the difficulty of hitting home runs from the right side in Angel Stadium, PECOTA projected him to hit 33 in 2012. The probability that a 33-home-run hitter would go homerless over 107 plate appearances by chance alone is just 0.3 percent. Either Pujols has been extremely unlucky, he’s declined more quickly than PECOTA expected, or he’s pressing at the plate.

Privately, Pujols is probably feeling some pressure. Publicly, though, he claims to be unconcerned. “I don’t think about that, man. It could be tomorrow, maybe the next day, a month from now, I don’t know. My job is to get myself ready to play and take my swing. Home runs, when they come, they come in bunches.”

At this point, Pujols would probably settle for hitting homers in dribs and drabs, never mind bunches. According to his comments, though, when he does start hitting homers, they’ll add up in a hurry. But can Albert be believed? The belief that home runs are hit in bunches (in other words, that they’re hit in flurries followed by droughts, rather than at regular intervals) isn’t unique to the struggling Angels star.
When Bryan LaHair went homerless this spring, Cubs manager Dale Sveum said, “People forget that home runs come in bunches.” Since then, they have for LaHair, who has hit six in the regular season, if not for the Cubs, who have collectively hit fewer home runs than Matt Kemp. But the history of “home runs in bunches” goes back well beyond Bryan LaHair. Writers and players alike have been referring to the idea at least since the middle of the last century: in 1958, Willie Mays said, “When I hit home runs I get them in bunches and then no more for a time.”

Is there anything to this, or is “home runs are hit in bunches” another baseball myth that deserves to be busted? Google “clutch hitting,” “pitching to the score,” or a host of other time-honored baseball beliefs, and you’ll find countless studies that have tried and failed to find any statistical evidence supporting them. The contention that homers are hit in bunches seems to have escaped investigation so far, but it’s just as easy to check.

Using a statistical concept called the binomial distribution, we determined the theoretical rates of zero-, one-, two-, three-, and four-homer games for the average major-league batter. By comparing those predicted rates to how often those games actually occurred, we could see whether there was anything to the idea that home runs are hit in bunches. If players actually alternate between home-run hot streaks and dry spells, their long balls would be bunched together, and we would see higher rates of two- and zero-homer games and lower rates of one-homer games than predicted.

Over small samples, of course, some players do have more two-homer games than predicted. In 78 games last season, Mike Cameron hit nine home runs, six of which came in three two-homer bunches. Those three two-homer games were about 2.6 more than the model would have predicted. Cameron was one of five players to have at least two more two-homer games than he “should” have in 2011:
Over larger samples, though, we don’t see correspondingly large differences. Of the 211 players with at least 3000 plate appearances from 2002-2011, only nine had at least five more two-homer games than expected:
It’s possible that Vlad’s homers had some slight tendency to be “bunched,” but even in his case, it’s likely that the difference was due to chance.

So what’s the verdict when we look at home-run distributions for all players? The following table shows the predicted and observed percentages of games in which an average major-league batter hit each number of home runs from 1994-2011. The model predicted that the average player would go homerless in 89.29 percent of his games, hit one homer in 9.99 percent of his games, and hit two homers in 0.68 percent of his games. The predicted and observed results are almost identical, and the slight differences aren’t significant.
Here is what those percentages look like for Pujols’ career. As one would expect, both the theoretical model and the in-game results show that he’s been much more likely to go deep than the typical player, but he hasn’t had more multi-homer games than expected.
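For readers who want to try the comparison themselves, here is a minimal Python sketch of the kind of binomial calculation described above. It is not the model used for this article: it assumes a fixed four plate appearances per game and a constant home-run probability per plate appearance, and the league-average rate of 0.028 and the 650-plate-appearance season are hypothetical inputs chosen only for illustration. The function names are likewise made up for this example.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def homers_per_game(p_hr: float, pa_per_game: int = 4) -> list[float]:
    """Predicted share of games with 0, 1, 2, ... homers.

    Assumes a fixed number of plate appearances per game and a constant
    home-run probability per plate appearance (both simplifications).
    """
    return [binomial_pmf(k, pa_per_game, p_hr) for k in range(pa_per_game + 1)]


def excess_games(observed: int, games: int, predicted_share: float) -> float:
    """Observed games of a given type minus the count the model expects."""
    return observed - games * predicted_share


# Hypothetical league-average rate: about one homer per 36 plate appearances.
league = homers_per_game(p_hr=0.028)
for k, share in enumerate(league):
    print(f"{k}-homer games: {share:.2%}")

# Homerless-streak check: 33 homers over an assumed 650 plate appearances.
drought = binomial_pmf(0, 107, 33 / 650)
print(f"Chance of 0 homers in 107 PA: {drought:.2%}")
```

If homers really came in bunches, `excess_games` would tend to come out positive for zero- and two-homer games and negative for one-homer games. Because real hitters don't get exactly four plate appearances every game, the shares printed here won't match the article's figures exactly, and the drought probability moves with whatever season length is assumed.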
So why do Pujols and so many other players mistakenly believe that they’re hitting home runs in bunches? A cognitive bias called the "availability heuristic" might be to blame. According to Amos Tversky and Daniel Kahneman, the psychologists who coined the term, the availability heuristic is our “tendency to make a judgement about the frequency of an event based on how easy it is to recall similar instances.” The easier it is to summon instances of an event to our minds, the more often we believe that event to occur. For hitters, few events are more memorable than a multi-homer game or a long stretch without hitting a homer, so it’s not surprising that those events seem to them to happen more often than they do.

Home runs aren’t really hit in bunches, but it’s probably in the Angels’ best interests not to burst Albert’s bubble. There could be some psychological benefit to believing in bunches. In the midst of a home-run barrage last May, Mark Teixeira explained his success by saying, “Home runs come in bunches, and right now I’m just in one of those streaks where I’m hitting them out of the park a lot.” After ending the longest homerless stretch of his career in July of 2009, Teixeira used the same reasoning to explain his struggles: “I’m a streaky home run hitter. They come in bunches, and after hitting a bunch in a row, it took a while to get another one.”

Teixeira’s all-purpose explanation suggests that while hitting homers in bunches isn’t fact, it is a useful fiction. One of the most important qualities for a hitter to have is confidence, and the “bunches” belief provides a confidence boost for any occasion. A player who has homered recently can go to the plate believing he’s mid-bunch and about to hit another. A player who hasn’t homered in ages can console himself with the thought that a bunch of long balls could be a game away. What Albert Pujols could really use right now is a homer. But some confidence can’t hurt.
Colin Wyers provided research assistance for this article.
A version of this story originally appeared on ESPN Insider
Ben Lindbergh is an author of Baseball Prospectus. Follow @benlindbergh
Do you think Pujols (and other 'bunches' proponents) meant per game, or in any given week or month? Kemp just hit 12 in April, but with only one 2-homer game.
As with all April stats and streaks, would this be anywhere near as big a story if a) Albert were still a Cardinal or b) this happened in August? Or May, even?
John, I don't think Pujols and others are referring to multi-homer games specifically. However, the way Colin Wyers put this to me when I asked him about it is, "A game is just a really short streak." In other words, if hitters went through periods where they were "bunching" homers, we'd see that reflected in the rate of multi-homer games as well as over longer stretches. Looking at it this way was less computationally intensive.
This wouldn't be as big a story if Pujols hadn't just changed teams and signed a giant contract, or if it had happened later in the season. But it's definitely reached the point at which it's a valid cause for concern.
Thanks Ben. Any data on NL hitters moving to the AL? Also, how has Angel Stadium played in April? Seems the announcers referenced a damp, thick air at night in April (heard this during the Orioles series there from Palmer or Thorne)...anything to it?
A game is not just a short streak, it is a really, really short streak. Perhaps a way to do it would be to look at a histogram of AB's between homers. Conceptually, hitting them in streaks would show up as bi-modal... a cluster around the streak, and a cluster around the typical non-streak. You could normalize each player by his average AB's between homers, and test for a trend among all players.
I get that theory. I just don't think it's useful. If a player picks up a homer in his 3rd PA of a game, he's probably only got one more chance to pick up a 2nd HR in that game. So the window of opportunity to see a "bunching" trend is pretty limited.
I understand why it's "less computationally intensive". I just want somebody to do that heavy lifting! ;-)
Good article. Thanks Ben!