October 14, 2016

Flu-Like Symptoms

Dusting Off a Bill James Postseason Prediction System

In his 1984 Bill James Baseball Abstract, the third mass-market Abstract, Bill James introduced what he called “The World Series Prediction System.” Actually, he re-introduced it: the section in the Abstract was entitled “The World Series Prediction System, Revisited.” He’d developed it in 1972 and updated it in a 1982 Inside Sports magazine article that ran shortly before Inside Sports folded.

James’ system, he reported, picked 70 percent of World Series winners. His system was a Franken-stat that combined hitting, pitching, and fielding features, assigning points to various metrics, and selected the team with the most points as the likely winner. His system was:
I know some of those weights seem screwy, but that’s how the numbers worked out. He looked at every postseason series and checked how often the winning team exhibited certain characteristics. Shutouts got a weight of 19 because, among the series he considered, the team with more shutouts won 19 more times than it lost. The team with fewer doubles won 14 more times, the team with the lower relative ERA won 15 more times, etc. And there’s an element of intuitive sense; high-average offenses may be dependent on stringing a lot of singles and doubles together, while scoring in the postseason is often long ball-dependent.

As an example of James’ system, consider the famous 1969 World Series between the Mets and the Orioles. The Mets had fewer doubles (14 points), more triples (12 points), a lower batting average (8 points), more double plays (7 points), more shutouts (19 points), and allowed more walks (7 points). The Orioles had a better record by nine games (18 points), scored more runs (3 points), had more home runs (10 points), fewer errors (8 points), a lower relative ERA (15 points), and more recent (i.e., ever) postseason experience (12 points). That’s 67 points for New York and 66 for Baltimore. The Mets won the World Series in five games.

Now, there’s a significant limitation to James’ system. His formula was printed in the 1984 Abstract, which means he had data through the 1983 season. That’s only 30 Championship Series to analyze (two each from 1969 through 1983), all in a best-of-five format. (The CS expanded to seven games in 1985.) The Division Series didn’t start until 1995, unless you count the oddball split-season 1981 postseason. So James’ system, which is based on actual postseason results, is missing:
That’s a lot of data! So I decided to freshen up James’ formula, using data through the 2015 season. I included only seasons in the divisional-play era, from 1969 to the present, for two reasons. First, I think one can make a persuasive case that the game has changed a lot since, say, the 1916 season, when the Brooklyn Robins got 10 points under James’ system for out-homering the Boston Red Sox, 28-14. Second, there’s an argument that the multiple-tier playoff system (Championship Series plus World Series beginning in 1969, with the Division Series added in 1995 and the Wild Card play-in starting in 2012) creates different determinants of postseason success, as fatigue and depth become factors.

I also added a few categories that weren’t in James’ initial formula (batter walks, batter and pitcher strikeouts, on-base percentage, and slugging percentage), just to see whether they worked out. (By and large, they didn’t.) And I excluded the strike-shortened 1981 split season. I’ll present the results as a series of questions.

Have the weights changed?

Yes, they have, by quite a bit. Here are the categories James identified, with their original weights and those calculated by looking exclusively at 1969-2015:
Some weights have changed significantly. For example, when James did his analysis, teams with a lower batting average had done better in the postseason than teams with a higher batting average, by a little. Since 1969, teams with a higher batting average have done better, by a lot. Where applicable, head-to-head record was meaningful; it’s not so much anymore. Hitting more triples was a good thing; now it isn’t. Scoring more runs was mildly positive; now it’s a big positive. But before we go too far with this, let’s move on to our second question.

Does the type of series make a difference?

Yes, it turns out, it does. James lumped together World Series and Championship Series, because there weren’t many of the latter. He didn’t have sufficient data to break them apart. Since 1969, there have been (excluding the 1981 and 1994 strike seasons) 45 World Series, 90 Championship Series, and 84 Division Series. How do they differ?
That’s a lot of variance. The team with more triples has won the World Series more often, but has been at a disadvantage in the Division Series and Championship Series. Having fewer batter strikeouts and more shutouts are big advantages in the Division Series, but aren’t much of a factor beyond that. Recent postseason experience has translated into more success in Division Series and World Series, but not Championship Series. Those and other differences, it would seem, argue in favor of different formulae for different postseason series.

James’ original formula had 12 variables, plus one for intraleague head-to-head records. So let’s develop new formulae with roughly the same number of inputs, based on the table above.

DIVISION SERIES:
CHAMPIONSHIP SERIES:
WORLD SERIES:
Two small notes: I found almost no evidence that the link between overall record and postseason success scales with the size of the gap in records, so I didn’t assign more points, as James did, to teams based on the size of the difference in won-lost records. And I ignored interleague won-lost record for World Series contestants, since the sample sizes, if nonzero, are tiny.

Before we see what the revised system says about 2016, let’s backtest:
That’s pretty good! The 68 percent overall success rate compares to James’ 70 percent reported in the spring of 1984. Let’s apply it to this season.

What does the system say about 2016?

Well, the system got both ALDS series right, favoring Toronto over Texas and Cleveland over Boston. It got the Giants-Cubs series wrong, assigning 71 points to San Francisco (fewer runs, fewer doubles, fewer walks, fewer strikeouts, higher BA, fewer errors, worse head-to-head) and 52 to Chicago (fewer triples, more shutouts, lower ERA, more recent postseason), but it saw the Dodgers beating the Nationals.

For the ALCS, the system gives a narrow edge to the Blue Jays over the Indians. Toronto gets points for fewer doubles, fewer triples, more homers, higher OBP, fewer pitcher strikeouts, and a lower ERA. The Indians get credit for a better record, more runs, higher BA and SLG, and fewer double plays, and the two teams issued the same number of walks. It favors the Cubs (better record, more runs, more homers, higher BA, OBP, and SLG, fewer pitcher strikeouts, lower ERA) over the Dodgers (fewer doubles, fewer triples, fewer pitcher walks, fewer double plays) in the NLCS.

In the World Series, there are four possible scenarios. The system likes the Cubs against both the Indians and the Blue Jays. It favors Cleveland over Los Angeles, and the Dodgers over the Jays.

We’ll see how the system works as the postseason moves forward. And it goes without saying that any complaints should be addressed to Bill James, c/o Boston Red Sox.
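For anyone who wants to check the arithmetic, two calculations recur in this piece: a category's weight is the number of times the team with that edge won minus the number of times it lost, and a series pick is whichever team piles up more points. Here is a minimal Python sketch of both. It uses only the weights quoted in the 1969 Mets-Orioles example above. The category names and function names are made up for illustration, the two-points-per-game figure is inferred from "nine games (18 points)," and the head-to-head category is left out because its weight isn't quoted in the text.

```python
# Weights quoted in the 1969 Mets-Orioles example in the article.
# The dictionary keys are hypothetical labels, not James' own terms.
QUOTED_WEIGHTS = {
    "fewer_doubles": 14,
    "more_triples": 12,
    "lower_batting_average": 8,
    "more_double_plays": 7,
    "more_shutouts": 19,
    "more_walks_allowed": 7,
    "more_runs": 3,
    "more_home_runs": 10,
    "fewer_errors": 8,
    "lower_relative_era": 15,
    "more_recent_postseason": 12,
}

# Assumption: inferred from the example, where a nine-game edge earned 18 points.
POINTS_PER_GAME_OF_RECORD_EDGE = 2


def differential_weight(wins_with_edge, losses_with_edge):
    """Weight for a category: series won minus series lost by the team holding the edge."""
    return wins_with_edge - losses_with_edge


def tally(edges, games_better=0):
    """Sum the quoted weights for each edge a team holds, plus points for a better record."""
    category_points = sum(QUOTED_WEIGHTS[edge] for edge in edges)
    return category_points + POINTS_PER_GAME_OF_RECORD_EDGE * games_better


mets = tally([
    "fewer_doubles", "more_triples", "lower_batting_average",
    "more_double_plays", "more_shutouts", "more_walks_allowed",
])
orioles = tally([
    "more_runs", "more_home_runs", "fewer_errors",
    "lower_relative_era", "more_recent_postseason",
], games_better=9)

pick = "Mets" if mets > orioles else "Orioles"
print(mets, orioles, pick)  # 67 66 Mets

# Illustrative counts only, not James' actual tallies:
# an edge that went 24-5 would earn a weight of 19.
example_weight = differential_weight(24, 5)
```

The revised formulae would work the same way, with a separate weight dictionary for each type of series and a flat bonus for the better record instead of one scaled by games.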
Rob Mains is an author of Baseball Prospectus. Follow @Cran_Boy
Was it refreshing to assign stats to fit a percentage goal? While time intensive, seems easier (and a little more fun) than coming up with a brand new statistical model.
Thanks for updating this.
Yeah, lipitorkid, this may be the most unwieldy spreadsheet I've ever developed. Not the largest or most complex by any means, but man, a lot of false starts and error correction and whatnot. Once I had all the data, though, it was easy to figure out which variables and coefficients to use. The nearly 70% success rate was the last thing I calculated and a nice dividend.