February 13, 2015

Pebble Hunting

Testing PECOTA's Memory

by Sam Miller

When I’m asked about a specific PECOTA projection that seems hard to swallow, or about the system in general, I usually point out that PECOTA has something awesome that many of us don’t: A long memory. It doesn’t overreact to the past week, or even the past two years. I do.

This is a strong approach, and it’s why I use PECOTA instead of just making up my own projection for every player based on how I feel. There are two potential vulnerabilities in it, though. One is that PECOTA’s long memory keeps it from very quickly incorporating new, perhaps unique information in an assessment. Simply: We might know that a pitcher’s shoulder seems to be compromised, or that a hitter changed his swing, and we can adapt on the fly. PECOTA, for obvious reasons, can’t, nor can any other projection system. The other is that PECOTA might misunderstand something about that player; by “remembering” this misunderstanding, but being unable to correct it, it might have the same blind spot, or bias, or what have you, every year. Which is why I’m very sympathetic to this idea:

I’ve felt that way at times. I can appreciate why the reaction to the Orioles’ projection has been skeptical. I look at Dan Uggla and remember all the years (it seems) that PECOTA was a bit too bullish on him—at least the past four years running, and this year seems a lock to be the fifth. And so, if I cheered for a team of Dan Ugglas, and PECOTA told me they were going to win 85 games every year, by the fourth year I might quit hoping so hard.

So, now, to the point: Is everything I just wrote right? Does PECOTA take too long to embrace players who have made genuine improvements, is it too slow to throw overboard those who are collapsing, and does it have a “hated [X] the last two years” problem that you (or we) should account for?

We’re going to be dealing with four years of projections and actual results, from 2010 through 2013. (Results from 2014 will also be necessary for one step.) We take all the players who met two criteria in the same season: they received a PECOTA projection, and they had at least 275 plate appearances. Each player who beat his projected True Average by at least 20 points is tagged a gainer. Each who fell at least 20 points short of his projected True Average is tagged a loser.
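The tagging rule can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the article: the `PlayerSeason` record, its field names, and the `tag` function are hypothetical, True Average is held in integer points (.265 becomes 265) to keep the comparison exact, and the 20-point cutoff is assumed to be inclusive.

```python
from dataclasses import dataclass
from typing import Optional

MIN_PA = 275      # plate-appearance floor used in the article
THRESHOLD = 20    # points of True Average (assumed inclusive)


@dataclass
class PlayerSeason:
    """Hypothetical record for one player in one season."""
    name: str
    projected_tav: Optional[int]  # PECOTA projection in points, None if unprojected
    actual_tav: int               # actual True Average in points
    pa: int                       # plate appearances


def tag(season: PlayerSeason) -> Optional[str]:
    """Return 'gainer', 'loser', or None if the season does not qualify."""
    if season.projected_tav is None or season.pa < MIN_PA:
        return None  # fails one of the two entry criteria
    diff = season.actual_tav - season.projected_tav
    if diff >= THRESHOLD:
        return "gainer"
    if diff <= -THRESHOLD:
        return "loser"
    return None  # finished within 20 points of the projection


# Made-up example: projected .260, hit .285 in 500 plate appearances.
print(tag(PlayerSeason("Example Hitter", 260, 285, 500)))
```

Players who finish within 20 points of their projection get no tag and drop out of the study, as do those short of 275 plate appearances.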

Twenty points of True Average, just to get a scale in your head, was roughly this last year:

Once we have our gainers and our losers, we look at how they did next year, relative to their next-year PECOTA projection. There are three possibilities: They outperformed (by even one point of True Average); they underperformed (same); or they didn’t make it to 275 plate appearances again, which we’ll call, for these purposes, attrition.
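As a rough sketch, the second-year sort looks like this in Python. The function name and argument layout are assumptions made for illustration, with True Average again in integer points. The text names three outcomes, but the results also count players who hit their projection exactly, so the sketch returns a fourth label for that case.

```python
from typing import Optional

MIN_PA = 275  # same plate-appearance floor as in year one


def year_two_outcome(projected_tav: int, actual_tav: Optional[int], pa: int) -> str:
    """Classify a gainer's or loser's following season against its new projection."""
    if actual_tav is None or pa < MIN_PA:
        return "attrition"       # did not reach 275 plate appearances again
    if actual_tav > projected_tav:
        return "outperformed"    # even one point counts
    if actual_tav < projected_tav:
        return "underperformed"  # even one point counts
    return "on the mark"         # matched the projection exactly


print(year_two_outcome(265, 266, 400))  # one point better than projected
```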

Now, our Persistently Wrong hypothesis goes something like this: If PECOTA is prone to misunderstanding certain players, then we will see a substantial number of players who significantly outperformed their projections in one year outperform their projections the next. We will also see players who significantly underperformed their projections one year underperform them the next.

To the results. There are, on average, about 110 players each year who fit as gainers or losers. The majority are gainers, because gainers are more likely to get the 275 plate appearances needed. Over the course of four years, that gives us 273 gainers, and within that group:

  • 115 outperformed the next season (by an average of 22 points of TAv)
  • 103 underperformed the next season (by an average of 21 points)
  • Three hit their projection on the mark
  • While 52 did not reach our 275-plate appearance threshold

And, of 173 losers:

  • 48 outperformed (by 20 points)
  • 49 underperformed (by 19)
  • Four hit their mark exactly
  • While 71 didn't reach the minimum plate appearances.
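To put the two lists on a common scale, the counts can be turned into rates. The sketch below uses only the figures listed above; the dictionary layout and the `summarize` helper are illustrative choices. The loser counts as listed add up to 172, one short of the stated 173, and the sketch uses the listed figures unchanged.

```python
# Year-two outcome counts as listed in the article.
GAINERS = {"outperformed": 115, "underperformed": 103, "on the mark": 3, "attrition": 52}
LOSERS = {"outperformed": 48, "underperformed": 49, "on the mark": 4, "attrition": 71}


def summarize(counts):
    """Return group size, survivors, attrition rate, and outperform share of survivors."""
    total = sum(counts.values())
    survivors = total - counts["attrition"]  # reached 275 PA again
    return {
        "total": total,
        "survivors": survivors,
        "attrition_rate": counts["attrition"] / total,
        "outperform_share": counts["outperformed"] / survivors,
    }


for label, counts in (("Gainers", GAINERS), ("Losers", LOSERS)):
    s = summarize(counts)
    print(f"{label}: {s['attrition_rate']:.1%} attrition, "
          f"{s['outperform_share']:.1%} of survivors outperformed")
```

Among players who made it back to 275 plate appearances, about 52 percent of gainers beat their next projection, against about 48 percent of losers. The attrition rates are roughly 19 percent and 41 percent.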

We’re not done, so hold your obvious question, but a quick assessment: At first glance, a very subtle effect, with players who outperformed PECOTA once going on to outperform PECOTA twice. Not a big effect at all, and the inverse doesn't show up for the TAv Losers. But something, and interesting enough.

But you’re screaming “survivorship bias” at me, with good reason. Because the second-year over- and underperformers are limited to those who got 275 plate appearances, this process is going to weed out some of the players who sucked in year two. The rate of attrition is itself valuable information, and more of the Losers disappeared from our rolls.

We can’t just look at the rate of attrition overall, though. Players who were defined as Losers after Year One are, based on that Year One performance, probably actually worse than the ones who were defined as Gainers. Indeed, PECOTA projects a lower TAv for the Losers as a group than it projects for the Gainers. They are, thus, more likely to wash out of the league in Year Two, just because of who they are.

To deal with this, we’re going to figure out how many players “should” have failed to make it to 275 plate appearances in Year Two. We’re going to look at all 1,145 players who got 275 plate appearances in one year during that stretch and see how many got 275 plate appearances in the following year. We’re splitting them into groups to get an expected attrition rate for each level of PECOTA projection. Every player who projected to have a True Average below .220, for instance, was a victim of attrition the next year; but very few who projected to have a True Average over .300 were.
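One way to run that grouping is sketched below. It is an illustration under assumptions, not the article's actual code: the tier boundaries follow the ten-point bands used in this piece, projections are in integer points, anything at or below .220 falls into the lowest band, and the input format (a projected TAv paired with a flag for reaching 275 plate appearances) is hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict

# Upper bound of each tier in TAv points; anything above 300 is the top tier.
UPPER_BOUNDS = [220, 230, 240, 250, 260, 270, 280, 290, 300]
LABELS = [".200-.220", ".221-.230", ".231-.240", ".241-.250", ".251-.260",
          ".261-.270", ".271-.280", ".281-.290", ".291-.300", ".300+"]


def attrition_by_tier(records):
    """records: iterable of (projected_year2_tav, reached_275_pa) pairs.

    Returns {tier label: share of players who fell short of 275 PA}.
    """
    washed_out = defaultdict(int)
    total = defaultdict(int)
    for projected_tav, reached in records:
        tier = bisect_left(UPPER_BOUNDS, projected_tav)  # index of the matching band
        total[tier] += 1
        if not reached:
            washed_out[tier] += 1
    return {LABELS[t]: washed_out[t] / total[t] for t in sorted(total)}


# Made-up records, just to show the shape of the output.
sample = [(215, False), (245, True), (245, False), (305, True)]
print(attrition_by_tier(sample))
```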

That gives us these expected attrition rates:

Projected Year 2 TAv    Year 2 Attrition
.200-.220               1.00
.221-.230               0.82
.231-.240               0.51
.241-.250               0.42
.251-.260               0.29
.261-.270               0.20
.271-.280               0.19
.281-.290               0.10
.291-.300               0.11
.300+                   0.06
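Read as a lookup, the table might be coded like this. The rates are the ones listed above. The function name is hypothetical, projections are in integer points, and two boundary choices are assumptions: a projection below .200 is treated as the lowest tier, and exactly .300 is placed in the .291-.300 band.

```python
# (upper bound of tier in TAv points, Year 2 attrition rate) from the table.
TIERS = [
    (220, 1.00), (230, 0.82), (240, 0.51), (250, 0.42), (260, 0.29),
    (270, 0.20), (280, 0.19), (290, 0.10), (300, 0.11),
]
TOP_TIER_RATE = 0.06  # projections above .300


def expected_attrition_rate(projected_tav: int) -> float:
    """Return the expected attrition rate for a projected Year 2 TAv in points."""
    for upper_bound, rate in TIERS:
        if projected_tav <= upper_bound:
            return rate
    return TOP_TIER_RATE


print(expected_attrition_rate(255))  # a .255 projection falls in the .251-.260 band
```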

Applying those to our groups of Gainers and Losers, we get these expected attrition totals, alongside actual attrition:

  • Gainers: 60 Expected, 52 Actual (+8)
  • Losers: 48 Expected, 71 Actual (-23)
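The expected totals are, presumably, each player's tier rate added up across the group, though the article does not spell the calculation out. A hedged sketch with hypothetical helper names follows. The per-player rates in the example are made up, while the 60/52 and 48/71 figures are the article's.

```python
def expected_attrition(rates):
    """Sum per-player attrition rates to get the expected number of washouts."""
    return sum(rates)


def attrition_gap(expected, actual):
    """Expected minus actual: positive means fewer washouts than expected."""
    return expected - actual


# Made-up group of five players and their tier rates.
print(expected_attrition([0.42, 0.29, 0.29, 0.20, 0.10]))

# The article's group totals.
print(attrition_gap(60, 52))  # Gainers
print(attrition_gap(48, 71))  # Losers
```

The sign convention matches the figures in parentheses above: +8 for the Gainers, -23 for the Losers.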

An "attrition"—at least, an attrition above normal expectations—is itself an underperformance of one's projection, because it implies that the player was worse than PECOTA projected (and should have had a projected True Average in one of the worse, more attrition-prone tiers). And a lack of attrition (relative to normal expectations) is overperformance. With these included, the case gets more compelling that there has been, within the large subgroup of players who deviated from their projections, a smaller subgroup of players whose talent level truly changed—unnoticed, or perhaps undernoticed, by PECOTA.
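One way to read "with these included" is to fold the attrition gap into the over/under counts. The sketch below is an interpretation, not a calculation the article performs: surplus attrition is added to the underperformers, a shortfall of attrition is added to the outperformers, and the function name is hypothetical.

```python
def adjusted_tally(outperformed, underperformed, expected_attrition, actual_attrition):
    """Fold attrition above or below expectation into the over/under counts."""
    gap = expected_attrition - actual_attrition
    if gap > 0:
        outperformed += gap      # fewer washouts than expected
    else:
        underperformed += -gap   # more washouts than expected
    return outperformed, underperformed


for label, args in (("Gainers", (115, 103, 60, 52)), ("Losers", (48, 49, 48, 71))):
    over, under = adjusted_tally(*args)
    print(f"{label}: {over} over, {under} under ({over / (over + under):.1%} over)")
```

On this reading the Gainers come out 123 over to 103 under, and the Losers 48 over to 72 under.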

(Possible sub-hypothesis: It could be that a player who has, say, a .250 True Average but has failed to live up to expectations (i.e., he's one of our Losers) might be more likely to lose his job than a player with a .250 True Average who is exactly meeting, or exceeding, even modest expectations. The Loser might carry the stench of disappointment, say. Or he might look like he's producing a downward trend line. Or consistency might be a valued trait in front offices, and inconsistency might thus be penalized. Or, as we established up top, maybe his human bosses just have shorter memories. So it's plausible—as a hypothesis—that the attrition rates are skewed by this. Keep that in mind! Humans remain involved in these things.)

Now: If we have found this small subgroup, it is a small subgroup only. Which means if you look at a list of 70 Gainers in a given year, you have to be smart enough to pick the five or so who we could theoretically outsmart PECOTA on. That’s a challenge, particularly when every hitter’s got a story to tell about why he got better, and when every bad-shouldered pitcher is quick to assure you he was just dealing with dead arm last summer but he’s back at full strength. PECOTA's long memory will keep you from making a lot of false-positive mistakes, and if it makes some small amount of false-negative mistakes—well, that's just a reminder that predicting the future is friggin hard. Hence the value of PECOTA, and hence your added value as a smart user who gets to incorporate his own wisdom into applying these projections. I come to you as someone who uses PECOTA constantly, faithfully—and, when the time is appropriate, skeptically. We don’t begrudge anybody the right to carve out his own exceptions, too.

***

Notes: When I get the time, I intend to replicate this with some other projection systems; the point here being not PECOTA specific but projection specific, and PECOTA being merely the projection system nearest to my heart. Also, when I get the time, I intend to replicate for pitchers, because it seems somewhat more likely that we'll find a bigger effect there.

Sam Miller is an author of Baseball Prospectus. 