November 14, 2014
Pitching Backward: Designing A Bullpen Usage Critique
Last week here at Pitching Backward we took a look at two managers, one who excelled at managing his bullpen and one who really struggled. The data is indisputable, and the analysis sound. There’s certainly a subjective component to it, but comparing the RE24 and inLI ranks for a handful of relievers on any given team is a pretty simple exercise.
The problem with everything I wrote last week is that it’s almost entirely wrong. Not wrong in principle, but wrong in practice. It’s easy for us to sit here and say, in hindsight, that a manager didn’t deploy his bullpen resources effectively, but we’re looking at averages, drastically oversimplifying the decisions that must be made against the unique battle terrain of each day.
While we talk often about the relatively simple decision of when to pull a starting pitcher, we have far less certainty over the subsequent decision: Who should the manager go to? Last week I argued that Fredi Gonzalez should have used Anthony Varvaro in higher-leverage situations. I never brought up the factors that might lead to Gonzalez using David Carpenter instead.
To get a sense of all that goes into the decision, here’s a taste of the factors adding branches to a manager’s decision tree:
· What is the situation? (score, inning, game importance, etc.)
· Who is due up for the opponent?
· Are there likely to be higher- or lower-leverage situations after this one?
· Should I bring in a left-handed or right-handed pitcher?
  o When might a LOOGY be best deployed?
  o How many left-handed and right-handed pitchers do I have?
· Is the opposing team a good low-ball or high-ball hitting team?
· When was the last time each of my relievers threw?
  o How many days in a row have they thrown?
  o How many pitches did they throw over their most recent outings?
· How do my relievers feel?
  o Did anyone have difficulty getting loose recently?
  o Is anyone coming off an injury?
  o Does anyone typically take longer to warm up?
  o Did anyone have poor command while throwing in the bullpen?
· What’s coming up on the schedule that I should be aware of?
· How have my relievers been performing?
· Am I going to need multiple relievers this inning?
· Does this situation align with someone’s role (e.g., setup man, closer, etc.)?
  o If so, will making an exception have motivational or psychological fallout?
That’s a lot of things to consider. By no means is it a complete list either, but simply a place to start when trying to think like an MLB manager. Keep in mind that not only does your favorite team’s manager need to go through all of those factors and pick the “right” guy, he needs to figure all of those things out far enough in advance for his relievers to properly warm up, and far enough in advance, in fact, that he hasn’t already blown his desired reliever in a less optimal setting earlier in the game, or even earlier in a series.
Let’s dissect some of these in more detail, as we aim to get inside the heads of the managers whose decisions we were so quick to criticize one week ago:
What is the situation? Who is due up for the opponent? Should I bring in a lefty or a righty?
Not all high-leverage scenarios are created equal. A manager might opt to use a lesser reliever in a high-leverage situation in the seventh inning, knowing that he might need one of his better relievers later in the game to face the opponents’ best hitter (or prioritizing the answer to one of the other questions on the list above).
When was the last time _____ pitched? Has he pitched on consecutive days? How many pitches did he throw those days?
It’s difficult to automate this kind of decision making; it has much more to do with feel and knowing your pitchers than just about anything else. Since we’re not actually in the clubhouse or manager’s office for every conversation, we will probably never be able to accurately model this part of the decision-making process.
We get glimpses of it here and there when a manager mentions that someone wasn’t available for some reason, but that’s about it.
What does the upcoming schedule look like?
Performance
That raises the question: How might a manager use recent performance as an indicator? A good manager is constantly reassessing his bullpen, often in granular, dynamic ways that he wouldn’t use in setting his lineup or starting rotation. It’s not uncommon for a reliever’s role on the staff to change multiple times over the course of the season, a clear indication that managers are always evaluating the performance of their relievers.
Think of this as the Ken Giles or Zach Britton Effect. At the start of the season neither guy was seen as an elite reliever: Giles started the year in Double-A, and Britton was the Orioles’ long man, throwing two or more innings in five of his first six outings. Each ended the season as one of the best high-leverage pitchers in baseball. If their managers hadn’t made the necessary adjustments, even on the basis of arguably small samples, these two would have been toiling away in low-leverage situations through the summer, assuming they earned MLB roles at all.
Perplexed, I posed a simple question to a colleague, Bryan Cole. I wanted to know how realistic it is for a manager to use recent performance to ‘predict’ a reliever’s next performance. Bryan built out a series of scatter plots that quickly illustrate how difficult it would be to say with confidence that recent performance was especially significant.
We selected three relievers (one elite, one middle-of-the-road, and one poor) to take a quick look at how well recent performance predicts the results of a pitcher’s next outing. We plotted the average RE24 posted by each reliever over his previous 15 outings (a completely arbitrary number, but one that seems reasonable for what a manager might consider both “recent” and “substantial”) against his RE24 in his next outing.
Here are the scatter plots for those pitchers, with a description to follow.
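For readers who want to try the same comparison on their own data, here is a minimal sketch of the pairing described above: the mean RE24 over a reliever's previous 15 outings set against his RE24 in the next one. The function names, the synthetic outing values, and the use of Pearson's r as a one-number summary are all assumptions made for illustration; the analysis in this piece relied on scatter plots, and nothing here reproduces its data or results.

```python
import random
from math import sqrt
from typing import List, Tuple


def trailing_vs_next(re24: List[float], window: int = 15) -> List[Tuple[float, float]]:
    """Pair the mean RE24 of the previous `window` outings with the next outing's RE24.

    `re24` is one reliever's per-outing RE24 values in chronological order.
    """
    if window <= 0:
        raise ValueError("window must be a positive number of outings")
    pairs = []
    for i in range(window, len(re24)):
        prior = re24[i - window:i]  # the `window` outings just before outing i
        pairs.append((sum(prior) / window, re24[i]))
    return pairs


def pearson_r(pairs: List[Tuple[float, float]]) -> float:
    """Pearson correlation between the trailing averages and the next-outing values."""
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two (trailing average, next outing) pairs")
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when either series is constant")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Synthetic, made-up outings for a hypothetical reliever; NOT real RE24 data.
    rng = random.Random(2014)
    outings = [rng.gauss(0.1, 0.8) for _ in range(60)]
    pairs = trailing_vs_next(outings, window=15)
    print(f"{len(pairs)} pairs, r = {pearson_r(pairs):.3f}")
```

Each pair is one point on a scatter plot like the ones described here, with the trailing average on one axis and the next outing on the other. Changing `window` is an easy way to see how much the picture depends on the admittedly arbitrary choice of 15.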
Every manager or team in baseball is going to have some form of short-term statistical measure that could help inform reliever selections. That said, RE24 is an ideal stat for us to use as a proxy for performance. Because it is calculated using run expectancy for the base-out states encountered by the pitcher, it gives a rough approximation of how a pitcher performed in a given outing. It can almost be considered a statistical account of what happened in those last 15 outings.
Unfortunately for MLB managers, the plots above show that there is little to no correlation between a pitcher’s performance in his most recent 15 outings and his next outing. It’s irrational for us to think that a manager should be able to quickly know with any precision when a reliever should move up or down the leverage pole, at least statistically. To a large degree, this becomes a test of a manager’s scouting acumen, another factor we’ll never be able to model.
All of this brings us back to the post from last week where I focused on the critical thinking and decision-making abilities of Fredi Gonzalez and Robin Ventura. I lauded Robin Ventura for his ability to recognize the limits of his veteran pitchers and for using less experienced pitchers in key roles. I also criticized Fredi Gonzalez for his abhorrent bullpen usage. But really, honestly, bluntly, we can’t be so confident that the numbers are telling us any perfect truth.
The data suggests that Ventura recognized his highly paid veteran relievers were faltering, and that he smartly recognized the potential of his young, talented arms to take over that high-leverage work. And it suggests that Fredi Gonzalez erred by (or at least paid the price for) using David Carpenter in higher-leverage situations than Anthony Varvaro to the bitter end. But given what we’ve looked at here, it doesn’t seem as realistic as it did last week that Ventura would be able to accurately make those types of decisions over the course of a season. It might not even be realistic for us to expect Gonzalez to have reacted to Varvaro’s performance by moving him into a more prominent role in the bullpen.
Reliever volatility year-to-year is a huge issue that teams must deal with when trading for or signing bullpen arms. If it’s unrealistic for teams to be able to accurately project performance of a reliever after a full year of performance, then why should we believe that a manager is capable of making those same judgments after just 15 outings?
In many ways Ventura did an excellent job managing his bullpen while Fredi Gonzalez, well, not so much. As for designing a model to truly assess, or even better predict, which moves to make? That’s the challenge. One day we might have a sophisticated model that can quickly and easily identify the optimal reliever to bring in for any given situation in real time. Until then, it’s important we understand and embrace the nuance of these decisions, and keep our knives from getting too sharp.
Jeff Long is an author of Baseball Prospectus. Follow @JeffLongBP
Possibly silly suggestion: given that RE24 is a counting stat, if there's any substantial variance in the length of the pitcher's outings, wouldn't that distort the results of the model? Wouldn't it be better to use the last 15 IP (or 20 IP, etc.) if it's a linear model, so that you're using something closer to a rate stat? Not that I think this is going to change the findings at all, the specification just struck me as odd.
It's a fair suggestion, but (at least on those three relievers) it doesn't change the findings at all.
Unsurprising. Thanks for checking.