Manufactured Runs: Solving the Mays Problem
September 8, 2010
So, we've been talking about revising the metrics we use here at Baseball Prospectus—I've described a fielding metric and a complementary batting metric. Now let's discuss some of the ways they fit together. One of the big things we need to do when we build all-encompassing metrics is adjust for position. That's because of the way we construct our metrics—we have offensive metrics that compare players to all other players, but defensive metrics that compare players only to other players at the same position. That makes it difficult to compare two players who play vastly different positions. Baseball fans of course know this intuitively—if you have a first baseman and a shortstop with the same batting line, the shortstop is likely the better player. Given the nature of the problem, the most straightforward solution would seem to be to compare players' batting to their peers at the same position. And for the most part, that approach works well.
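To make the offense-based approach concrete, here's a minimal sketch with invented players and run values: two hitters with identical raw batting lines come out very differently once each is compared to his positional peers.

```python
# Minimal sketch of an offense-based positional adjustment.
# Players, positions, and run values are invented for illustration only.
from collections import defaultdict

# (player, position, plate appearances, batting runs vs. the overall league average)
players = [
    ("Alpha",   "1B", 600, +10.0),
    ("Beta",    "1B", 550, +18.0),
    ("Gamma",   "1B", 500,  +6.0),
    ("Delta",   "SS", 600, +10.0),
    ("Epsilon", "SS", 580,  -5.0),
    ("Zeta",    "SS", 520,  -9.0),
]

# Playing-time-weighted average batting runs per PA at each position.
runs = defaultdict(float)
pas = defaultdict(float)
for _, pos, pa, batting_runs in players:
    runs[pos] += batting_runs
    pas[pos] += pa
pos_avg = {pos: runs[pos] / pas[pos] for pos in runs}

def runs_above_position(name):
    """Batting runs above the average hitter at the player's own position."""
    for player, pos, pa, batting_runs in players:
        if player == name:
            return batting_runs - pos_avg[pos] * pa

# Alpha (1B) and Delta (SS) have identical raw batting lines, but the
# shortstop comes out well ahead once each is compared to his positional peers.
print(round(runs_above_position("Alpha"), 1), round(runs_above_position("Delta"), 1))
```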
The Mays Problem

But you run into a problem in some extreme cases. Willie Mays is one of those cases. See, the thing about Mays is, he could have walked into Cooperstown wearing a first baseman's glove. He was simply an astonishing hitter. It's just that he could play an excellent defensive center field as well. The problem with analyzing Willie Mays is that he was so superlative that he moves the baseline to which he's being compared. And I call it the Willie Mays problem, but it wasn't Willie Mays alone causing it. Mays had a lot of help. Looking at the top players in games played in center field in the 1950s:
(Bold indicates players in the Hall of Fame.) That’s a really impressive list of baseball players. You have Mays and Mantle of course—Mantle, like Mays, is a guy whose bat was impressive for any position. Ashburn, Snider, and Doby were also incredible ballplayers, though, and prolific during the '50s. See, if we use raw positional averages, we end up asserting that the average center fielder is as good as the average corner outfielder. In the '50s, that wasn’t true—the average center fielder was a better defensive outfielder than the men in the corners, but he was also a better hitter than them. So in those cases, using positional averages for offense falls short. What else are we to do? We can’t simply increase our sample size to wash out the noise—as I said, even in a 10-year sample, we don’t see it wash out.
Adjusting Defense

What Tom Tango has proposed, and what has been adopted in many of the prevalent Wins Above Replacement metrics outside of Baseball Prospectus, is a set of adjustments based on comparing the fielding stats of players who play multiple positions. This is going to do OK on the Mays problem—at least the one actually involving Mays—but I feel that it introduces quite a few problems of its own along the way:
In other words, what you are doing is analyzing a much smaller (and biased) population with much cruder analytical tools, instead of analyzing a very large sample, all players, with very sharp analytical tools, like modern offensive metrics. The larger problem is that the distribution of defensive talent shifts over time. Comparing position switchers is a very crude way to track that—you need a lot of years of data to do that sort of analysis, and you need to manually intervene in some of the analysis yourself. So it's very hard to see where those shifts are occurring and fold them into the positional baselines.

OK, but do those problems actually show up in a way that distorts our comparisons of players? I think they do, and it's along the boundary between second and third base and between left and right field. For second and third base, we have over 50 years of data that says that third basemen practically always outhit second basemen. And yet looking at the position switchers in terms of fielding data, what you see is that they're basically even. What are the possibilities here? The first is that there is always a Willie Mays problem at third base—always a cluster of absurdly talented Hall of Famers at the hot corner biasing our evaluation. That seems rather unlikely, given that there are no players in common between, say, the '50s and the '90s, and yet we always see the same pattern. The next possibility is that baseball teams are just doing a poor job of allocating talent, and that they are needlessly diluting the second-base talent pool by keeping a greater portion of the good players at third base. I don't see any hard evidence for that contention, and it doesn't seem particularly reasonable. The third possibility is that we're simply missing something—that the model is failing to represent the underlying reality. To be perfectly blunt, I think the responsible way to do sabermetrics is to be very careful before asserting that it is our model, and not reality, that is correct when the two are in conflict. And from the way actual baseball teams behave, it seems that the defensive responsibilities at second are harder, and therefore teams are more likely to put their better defenders there than at third. You see the same thing with the corner outfield spots—although, interestingly enough, there's a shift in the late '70s: before then the left fielders were generally the better hitters, and after that the right fielders generally were.

So can we solve the Mays problem while still using offensive adjustments?

Outliers

At first blush, the Mays problem really looks like a simple (and common) problem in determining the average of a population—outliers. The arithmetic mean, the most common form of average, is very sensitive to outliers. There is a lot of good existing research on how to use more robust measures of central tendency—the median, truncated means, log transformations, and so on. Imagine my frustration when I discovered that none of them were effective. The problem? Willie Mays (and the other superlative players) aren't the only outliers. They're not even the biggest outliers. What you tend to find is that the biggest outliers at a position are typically the worst, not the best, players. That's almost entirely a function of the data set—when you're looking at the very sub-marginal players, what you're not seeing are the guys just like them who are playing in the high minors.
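Here's a toy illustration, with invented numbers, of why the standard robust estimators don't help: when the biggest outliers sit at the bottom of the distribution, the median and the truncated mean both land above the plain mean, raising the positional baseline instead of damping the stars.

```python
# Toy illustration (invented numbers) of why robust averages fail here:
# at this hypothetical position the biggest outliers are the WORST hitters,
# a long left tail of sub-marginal players, not the stars.
import statistics

# Batting runs per 600 PA for one position's regulars and fill-ins.
sample = [30, 25, 10, 6, 4, 2, 0, -1, -3, -5, -8, -12, -20, -38, -45]

def truncated_mean(xs, trim=0.2):
    """Mean after dropping the top and bottom `trim` fraction of values."""
    xs = sorted(xs)
    k = int(len(xs) * trim)
    return statistics.mean(xs[k:len(xs) - k])

print(f"mean:           {statistics.mean(sample):6.2f}")
print(f"median:         {statistics.median(sample):6.2f}")
print(f"truncated mean: {truncated_mean(sample):6.2f}")
# Because the extreme values sit at the bottom, the median and the truncated
# mean come out ABOVE the plain mean: the "robust" estimators curtail the
# sub-marginal players more than the stars and raise the positional baseline.
```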
And so if you're trying to reduce the influence of outliers, you end up curtailing the effects of the below-average players as much as you do the exceptional players, and it washes out. (Actually, you tend to raise the positional averages more often than you lower them.)

What I've done instead is split the sample in half—above average and below average—and then look at the distance between the means of the two halves. If there is very little skew, the halfway point between those two means will be the same average I used to split the dataset. But that's rarely what we see—typically one half sits farther from the average and has fewer plate appearances behind it. Weighting the two halves by playing time gives us back the initial average, but the unweighted midpoint of the two half-means does not, and the gap between them is an estimate of how the skew is tilting the average. So I used that unweighted difference to "shift" the average, and then applied an average amount of skew-related difference back to each position so that the positional averages still add up to the total league average.

This process is, as one might imagine, rather unstable over a single season. But over a period of nine years it stabilizes quite nicely (I chose nine because it lets me use the four seasons before and after the season of interest, as well as the season itself). That isn't to say that the picture is entirely clean—looking at corrected runs per plate appearance by position over the years:
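For the mechanically inclined, here is a rough sketch of that split-half skew estimate, using invented data; the size and sign of the final shift are the simplest choice consistent with the description above, not necessarily the exact implementation behind the published numbers.

```python
# Sketch of the split-half skew estimate (invented data; the size and sign of
# the final shift here are a simple illustrative choice).

# (plate appearances, batting runs per PA) for one position's hitters
hitters = [(650, 0.020), (600, 0.012), (580, 0.004), (520, -0.002),
           (450, -0.006), (300, -0.015), (150, -0.030), (80, -0.045)]

def weighted_mean(rows):
    total_pa = sum(pa for pa, _ in rows)
    return sum(pa * rate for pa, rate in rows) / total_pa

# 1. Split the position's hitters into above- and below-average halves.
overall = weighted_mean(hitters)
above = [row for row in hitters if row[1] >= overall]
below = [row for row in hitters if row[1] < overall]

# 2. Weighting the two halves by PA recovers the overall average; the
#    unweighted midpoint of the two half-means does not when the
#    distribution is skewed. The gap between them estimates the skew.
midpoint = (weighted_mean(above) + weighted_mean(below)) / 2
skew_shift = midpoint - overall

# 3. Use that gap to shift the positional average. (In the full method an
#    average amount of skew is then applied back to every position so the
#    positional averages still sum to the league average, and everything is
#    computed over a nine-year window to stabilize it.)
corrected = overall + skew_shift
print(f"raw positional average (runs/PA): {overall:+.4f}")
print(f"skew estimate:                    {skew_shift:+.4f}")
print(f"shifted positional average:       {corrected:+.4f}")
```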
You still do see a lot of shifts (little and big) over the years. Some of them may in fact be just noise. Others may be teams shifting talent around as conditions change. I mean, honestly—I wish it was cleaner, I do. But baseball analysis can get messy sometimes, and I think this is one of those times. Best to acknowledge the mess, rather than trying to put some throw pillows over it and act like it's not there.
Colin Wyers is an author of Baseball Prospectus.
Nice article. Position adjustments are a hairy issue, but they're so critical to everything.
...
Do you think the "Mays Problem" as you describe it is really an issue of outliers? Or is it just an overall talent gap between positions? Teams often start players out of high school and college at the far left of the defensive spectrum (SS if righty, CF if lefty) and then move them to the right once it's decided they can't play that fielding position adequately.
In other words, doesn't your method still assume, in the end, that total position talent (offense + fielding) is constant across positions? I've always felt that assumption was the biggest problem with offense-based position adjustments, because I don't think that's likely to be the case. 2B, in particular, seems to be a position to which players are moved when they can't field well enough to be a SS but also aren't particularly good hitters. I'll accept that they should be better fielders than 3B's (skills should be different, of course), but I do tend to think that 2B is a below-average position in terms of talent.
Here are data from Tango's fan scouting report last year. No position-specific weights, just overall averages of skills (rated 1-5) across positions:
CF 3.7
SS 3.6
2B 3.4
3B 3.3
RF 3.3
C 3.0 (weird position, though, so not apples to apples)
LF 3.0
1B 2.98
This seems about right to me, both in the ordering and in the size of the gaps between positions, maybe with the exception of CF's over SS. I wonder, though, whether this kind of data is a better solution to the various issues you've raised with position-switching data than going back to offense-based adjustments. Obviously it only applies to modern baseball, but it's a start.
Cheers,
Justin
"2B, in particular, seems to be a position to which players are moved to when they can't field well enough to be a SS, but you're also not a particularly good hitter. I'll accept that they should be better fielders than 3B's (skills should be different, of course), but I do tend to think that 2B is a below-average position in terms of talent."
Okay, but you could say that about most positions - guys who can't field well enough to play short move to 2B, 3B or CF, depending on their other talents. We know that the ones that move to 2B tend to hit less well than the ones that move to 3B or CF.
What we're left wondering is why? Is it because teams have just decided to put the worst of the failed shortstops at 2B? I don't think so - it makes no sense, and I tend to require a preponderance of evidence before I believe teams are behaving in a wholly irrational fashion.
Well, we know why the CF pool tends to hit better - it's where you put the guys who failed at being shortstop on account of being left-handed. So it really comes down to a question of 2B and 3B.
To speak broadly - the guys who move from SS to 2B are the guys with the range to play SS but not the arm. The guys who move to 3B are the ones with the arm but not the range. (This makes one wonder who the players represented in the position-switcher pool actually are, doesn't it?)
What I'm not doing is requiring the positions (offense and defense) to be equal. If you take the average center fielder from 1955 (without correcting for skew, as I have) in a full season, in this system he'll probably be +5, assuming neutral defense. A left fielder is going to be more like a -2. So you aren't forcing everything to zero - center fielders as a group were more valuable than left fielders that year, and you're capturing that difference.