Caught Looking

A Bayesian Approach to Fantasy

by Michael Wenz


Caught Looking examines articles from the academic literature relevant to baseball and statistical analysis. This review examines Daniel Herrlin’s Ph.D. dissertation at San Diego State University and Claremont Graduate University.

The title of my Ph.D. dissertation was just five words long: Casino Gambling and Economic Development. A nice thing about the topic and the title was that my non-academic friends felt comfortable chiming in with their opinions. A downside of the topic and title was that, well, they also all felt pretty comfortable with their opinions.

Daniel Herrlin chose a longer title for his dissertation, “Forecasting MLB Performance Utilizing a Bayesian Approach in Order to Optimize a Fantasy Baseball Draft,” but I don’t suspect it will be enough to overcome the commentary of friends and casual acquaintances who are sure they know better. And I can’t imagine the abuse he’ll take if he finishes fourth in his league this year.

The dissertation begins with an examination of batting order construction. It’s not obvious at the outset what this has to do with fantasy baseball, where we’re subject to the whims of real-baseball managers to fill out the lineup card, but a flowchart promises that we’ll find out later. A big part of the problem, it seems, with making decisions about batting order is that it takes a very long time to determine an optimal batting order via simulation, from a day to possibly weeks if more than nine players are in consideration.

Herrlin’s system uses a Markov chain approach that adds a significant level of complexity, especially through improved consideration of baserunning, but employs a greedy algorithm to reduce computational time. I expect greedy algorithms are familiar to computer science folks like Herrlin and his dissertation committee, but they weren’t to me. Essentially, they use some simple and reasonable rules to narrow the set of lineups considered, at the cost of a small chance of missing the very best lineup possible. Technically, the algorithm could converge on a local, rather than a global, maximum, but Herrlin uses randomization and some other tools to reduce (but not quite eliminate) this possibility.
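
For readers who, like the reviewer, have not met greedy algorithms before, here is a minimal sketch of one common form: a swap-based local search with random restarts. It is not Herrlin’s algorithm, whose rules the review does not spell out. The scoring function `toy_lineup_score`, its slot weights, and the on-base percentages are all hypothetical stand-ins for a real run-scoring simulation.

```python
import random

def toy_lineup_score(lineup):
    """Hypothetical stand-in for a run-scoring model (NOT a Markov chain).
    Earlier slots bat more often, so they get slightly larger weights."""
    slot_weights = [1.00, 0.98, 0.96, 0.94, 0.92, 0.90, 0.88, 0.86, 0.84]
    return sum(w * obp for w, obp in zip(slot_weights, lineup))

def greedy_lineup(players, score, restarts=5, seed=0):
    """Greedy local search: keep any pairwise swap that improves the score,
    stop when no swap helps, and repeat from several random starting orders."""
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(restarts):
        lineup = players[:]
        rng.shuffle(lineup)              # random restart
        current = score(lineup)
        improved = True
        while improved:
            improved = False
            for i in range(len(lineup)):
                for j in range(i + 1, len(lineup)):
                    lineup[i], lineup[j] = lineup[j], lineup[i]
                    candidate = score(lineup)
                    if candidate > current + 1e-12:
                        current = candidate      # keep the improving swap
                        improved = True
                    else:
                        lineup[i], lineup[j] = lineup[j], lineup[i]  # undo
        if current > best_score:
            best, best_score = lineup[:], current
    return best, best_score

# Made-up on-base percentages for nine hitters.
obps = [0.310, 0.390, 0.345, 0.300, 0.360, 0.325, 0.280, 0.335, 0.250]
best_lineup, best_value = greedy_lineup(obps, toy_lineup_score)
```

The search scores only a few hundred lineups per restart instead of all 362,880 orderings of nine hitters, which is where the time savings come from. This toy score is simple enough that every restart ends at the same answer; with a realistic model different restarts can get stuck at different local maxima, which is why keeping the best of several restarts helps.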

The system also produces some interesting results. Notably, the recommended first hitter in the lineup is usually bad, generally the seventh-best hitter in the lineup. On-base percentage was the most important driver of lineup position, and the best hitters generally wound up in the two through four slots. The worst hitter batted ninth, and while this would seem to imply that pitchers should hit ninth, it’s really an artifact of a bad hitter hitting first. If you choose to put a better hitter in the leadoff slot, it’s probably no longer true that the worst hitter should hit ninth. Additionally, while the lineup tended to bunch the best hitters in the 2-5 slots, simulated lineups with three sluggers split the third slugger out toward the end of the lineup to create two mini-lineups.

Next, Herrlin moves on to creating a model for player performance. He begins with hitters, first constructing the likelihood that an at-bat will end in a walk, single, double, triple, homer or out. Crucial to modeling hitters, especially for fantasy purposes where identifying breakout candidates early in their careers brings large returns, is recognizing that simply extrapolating available performance data is insufficient. Observed performance is due partly to luck and partly to skill, so a Bayesian method is applied to project future performance. As with greedy algorithms, I wish there were a bit more intuition provided for Bayesian methods generally, and for the Dirichlet distribution used for the Bayesian priors, but the narrative focuses on evaluating the success of its predictions. With a Bayesian approach, projections are based in part on the player’s observed performance and in part on an assumed underlying distribution of player ability, with the former getting more weight as the observed sample size rises. This approach is consistent with some of the best literature out there.
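
The weighting described above can be made concrete with the standard Dirichlet-multinomial update, in which the prior acts like a block of imaginary plate appearances added to the real ones. The sketch below is only an illustration of that textbook result; the prior counts (300 pseudo-appearances at made-up league-like rates) and both stat lines are hypothetical and are not taken from the dissertation.

```python
OUTCOMES = ["walk", "single", "double", "triple", "homer", "out"]

# Hypothetical prior: 300 pseudo-appearances at invented league-like rates.
PRIOR_ALPHA = {"walk": 24, "single": 45, "double": 14,
               "triple": 1.5, "homer": 9, "out": 206.5}

def posterior_mean(prior_alpha, counts):
    """Posterior mean of a Dirichlet prior after multinomial counts:
    (prior count + observed count) / (prior total + observed total)."""
    total = sum(prior_alpha[o] + counts.get(o, 0) for o in OUTCOMES)
    return {o: (prior_alpha[o] + counts.get(o, 0)) / total for o in OUTCOMES}

# A hot 20-appearance start: 4 homers, a raw home run rate of 0.200.
small_sample = {"walk": 2, "single": 3, "homer": 4, "out": 11}

# The same rates sustained over 600 appearances.
large_sample = {k: v * 30 for k, v in small_sample.items()}

small_projection = posterior_mean(PRIOR_ALPHA, small_sample)
large_projection = posterior_mean(PRIOR_ALPHA, large_sample)

prior_hr_rate = PRIOR_ALPHA["homer"] / sum(PRIOR_ALPHA.values())  # 0.030
```

After 20 appearances the projected home run rate barely moves off the prior’s 0.030, while after 600 appearances at the same pace it sits much closer to the observed 0.200. The size of the prior total (here 300) controls how quickly the data takes over.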

In addition, Herrlin models aging curves with a focus on isolated power (ISO) to provide the link. Aging curves are constructed at the individual level, but constrained by some tests for reasonableness. Age and ISO are also used, along with some other pieces, in a regression tree approach to forecast breakout or regression candidates. Finally, baserunning is added into the model, z-scores are created for each statistic to create a common scale, and players are given a ranking.
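
The z-score step is easy to sketch: each statistic is rescaled to "standard deviations above or below the pool average," so home runs and stolen bases can be added together. In the example below the players, the categories, and the equal weighting of categories are all assumptions made for illustration; the review does not say how the dissertation weights or combines the scores.

```python
from statistics import mean, pstdev

# Hypothetical projections for a three-player pool.
projections = {
    "A": {"HR": 30, "SB": 5,  "AVG": 0.280},
    "B": {"HR": 10, "SB": 40, "AVG": 0.300},
    "C": {"HR": 20, "SB": 10, "AVG": 0.250},
}

def z_score_totals(players):
    """Sum of per-category z-scores for each player (equal weights assumed)."""
    categories = next(iter(players.values())).keys()
    totals = {name: 0.0 for name in players}
    for cat in categories:
        values = [stats[cat] for stats in players.values()]
        mu, sigma = mean(values), pstdev(values)
        for name, stats in players.items():
            # A category with no spread cannot separate players, so it adds 0.
            totals[name] += (stats[cat] - mu) / sigma if sigma > 0 else 0.0
    return totals

totals = z_score_totals(projections)
ranking = sorted(totals, key=totals.get, reverse=True)
```

Here player B ranks first despite the fewest home runs, because the stolen base and batting average edges are large relative to the spread in this small pool.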

So how did the model do? The goal is to use the method to build the best fantasy team possible, so success is measured by whether the system can identify players who will outperform their draft position. To do this, you need a benchmark, in this case the rankings of Rotoworld and Athlon Sports. If the system works, the players the model ranks higher than the comparison draft lists do will perform better than their draft slot, while players it ranks lower will perform worse. By this criterion, the system did quite well. The model disagreed with Athlon and Rotoworld 38 times, and was superior to them in 26 of the 38 instances, or about 68 percent.

Pitchers came next, and the approach is essentially the same, albeit with different categories and some consideration for what to do with relief pitchers and the outsized fantasy importance of the save statistic. Aging curves were dropped for pitchers, however, as there is much less consistency in the literature about exactly how predictable pitcher aging effects really are. This time around, the model outperformed Rotoworld and roughly matched Athlon’s rankings.

Finally, Herrlin constructs a drafting algorithm that considers a number of different dimensions. Projection matters, but the algorithm is not based on pure projected performance rank. There are considerations for player variability, position and projected draft position. These combine to produce not one but three drafting strategies. In the early rounds, the focus is on finding people with high performance and low variability, without reaching. Basically, put down your building blocks first. In the middle rounds, fill in positional gaps and grab people who are under-ranked by the consensus. This is where he grabs the players whom he ranks significantly better than the competition and where positional flexibility exerts extra influence. Finally, in the later rounds, look for high variability in hopes of unearthing a sleeper or two.
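
A rough sketch of that three-phase idea follows. Everything specific in it is an assumption made for illustration: the round cutoffs, the scoring rule in each phase, the `Candidate` fields, and the player pool are invented, and positional needs are left out entirely to keep it short. It should not be read as the dissertation’s drafting algorithm.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    projection: float     # projected value, higher is better
    variability: float    # spread around that projection
    my_rank: int          # rank in our own list (1 = best)
    consensus_rank: int   # rank in the consensus list (1 = best)

def pick(candidates, round_number, early_until=5, middle_until=15):
    """Choose a player using a different (assumed) rule in each draft phase."""
    if round_number <= early_until:
        # Early: high projection, penalized for variability.
        key = lambda c: c.projection - c.variability
    elif round_number <= middle_until:
        # Middle: biggest gap between the consensus rank and our own rank.
        key = lambda c: c.consensus_rank - c.my_rank
    else:
        # Late: chase variability in hopes of a sleeper.
        key = lambda c: c.variability
    return max(candidates, key=key)

# Hypothetical pool of three available players.
pool = [
    Candidate("Steady",  projection=10.0, variability=1.0, my_rank=1,   consensus_rank=1),
    Candidate("Bargain", projection=7.0,  variability=2.0, my_rank=20,  consensus_rank=60),
    Candidate("Lottery", projection=5.0,  variability=6.0, my_rank=150, consensus_rank=140),
]

early_pick = pick(pool, round_number=1)
middle_pick = pick(pool, round_number=8)
late_pick = pick(pool, round_number=20)
```

With the same three players on the board, the rule takes the steady star in round 1, the under-ranked bargain in round 8, and the high-variance lottery ticket in round 20.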

Herrlin simulates a draft against the Rotoworld and Athlon ranking guides, and at first blush, it looks like his system does quite well. It’s not entirely clear, however, how the counterfactual rosters are chosen, and it’s probably unfair to assume that he’s the only one making positional scarcity adjustments. Position runs are common in drafts, but I don’t think his simulation considers that. Still, he’s grabbing guys well ahead of their consensus rankings and finding good players based on his own rankings in the later rounds, so it appears to work.

In the end, Herrlin has constructed a wide-ranging and thorough model to construct a fantasy baseball team. I have a few minor quibbles. It’s a little long on evaluation of results and a little short on intuition throughout for my taste, but he should write what his dissertation advisor wants, not what I want, so I can’t fault him for that. Also, I wish his best-in-class forecasts had included a crowdsourced metric such as ESPN’s average draft position rather than just Athlon and Rotoworld. These are small points, though. Generally, Herrlin has given due consideration to all the important elements of projecting fantasy baseball results and constructing a roster. The Bayesian approach is at the frontier of knowledge, and the drafting algorithm is sensible.

I hope he wins his league this year. It will make the conversations go much easier with the people who will inevitably ask about his dissertation.

Michael Wenz is Visiting Professor at Politechnika Czestochowa in Poland. Comments and suggestions for future articles are appreciated.