BP Unfiltered

Classifying Park Factors for DRA

by Jonathan Judge

Jared Cross, the esteemed author of the Steamer projection system, asked a very good question over at Tango's blog that I wanted to address in more detail. Referencing Deserved Run Average (“DRA”), our new descriptive pitcher value metric at Baseball Prospectus, Jared asked:

I’m wondering - why treat catchers as random effects but parks as fixed effects? I think that could be defended in terms of computation time but I don’t see the theoretical distinction between catchers and parks as elements of the context?

For background, some of you may recall that DRA is a sequence of two models. The first is a linear mixed model that determines the core “value” of a pitcher, adjusting for various confounders. The second is a MARS (multivariate adaptive regression splines) model that, among other things, considers additional factors beyond batting events and scales the “value” ratings in accordance with MLB rules, so that pitchers are awarded the same ratio of runs they would receive under the prevailing runs-per-nine-innings system. Both park factors and the effect of catchers are controlled for in the first (value) model.

A mixed model (or multilevel model, if you prefer that term) divides relevant predictors into two groups: (1) fixed effects and (2) random effects. Fixed effects are estimated as parameters, as in a traditional regression, while random effects are estimated as the most likely values for the levels in those categories, conditioned on the underlying data and on the parameter values first determined for the fixed effects.
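For readers who prefer notation, here is the generic textbook form of a linear mixed model with a single grouping factor. This is a sketch of the general idea only, not DRA's actual specification; the symbols are the conventional ones and do not come from our model code.

```latex
y = X\beta + Zu + \varepsilon,
\qquad u \sim \mathcal{N}(0, \sigma_u^2 I),
\qquad \varepsilon \sim \mathcal{N}(0, \sigma^2 I)
```

Here $\beta$ holds the fixed-effect coefficients, which are estimated as ordinary parameters, and $u$ holds the random effects, one per level of the grouping factor. The values reported for $u$ are predictions conditioned on the data and on the estimated $\beta$, $\sigma_u^2$, and $\sigma^2$.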

The line between what constitutes a fixed versus a random effect is hazy at best. (Bayesians would probably say it is non-existent.) Traditionally, practitioners tend to see fixed effects as containing all levels of interest, and thus “fixed” at predefined levels, whereas random effects tend to have a theoretically unlimited number of levels, so the levels observed constitute a mere random sample from a larger population. Certain aspects of baseball, like the number of base-out states, are fairly easy candidates for fixed effects, while baseball participants (including catchers) are more fairly considered random effects, since there is a theoretically unlimited combination of abilities and resulting players. (Players are also more fairly considered random effects because if you try to model them as fixed effects, your software will probably crash.) So, for these reasons, catchers are definitely random effects, and I don’t think most people would disagree.
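One practical consequence of treating players as a sample from a larger population is that each player's estimate gets pulled toward the population average, and more so when the player has little data. The sketch below uses the standard shrinkage formula for a one-way random intercept, assuming the variance components are already known. The catchers, sample sizes, and variances are all made up for illustration, and this is not how DRA itself is fit.

```python
def shrunken_effect(raw_deviation, n, resid_var, group_var):
    """Predicted random effect for one group in a one-way random-intercept model.

    raw_deviation: the group's observed mean minus the overall mean
    n:             number of observations for the group
    resid_var:     residual variance (assumed known here)
    group_var:     between-group variance (assumed known here)
    """
    # The weight approaches 1 as n grows and 0 as n shrinks.
    weight = n / (n + resid_var / group_var)
    return weight * raw_deviation


# Hypothetical variance components, chosen only for illustration.
RESID_VAR = 1.0
GROUP_VAR = 0.01

# Two made-up catchers with the same raw deviation but different workloads.
backup = shrunken_effect(0.05, n=20, resid_var=RESID_VAR, group_var=GROUP_VAR)
starter = shrunken_effect(0.05, n=2000, resid_var=RESID_VAR, group_var=GROUP_VAR)

print(f"backup catcher:  {backup:.4f}")
print(f"starting catcher: {starter:.4f}")
```

The backup's estimate lands much closer to zero than the starter's, even though their raw numbers are identical. A fixed effect would report the raw deviation for both.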

But park factors are more complicated. There are only 30 parks, so park factors are, in that sense, a firm and fixed category. On the other hand, the factors that affect scoring in parks are constantly changing. Thus, one could envision an unlimited range of park characteristics that vary from year to year, and even game to game, and treat any particular time period as a random sample from that larger range.

From a practical standpoint, there are also advantages to treating park factors as a random effect rather than a fixed effect. When you make a categorical variable a fixed effect, the model absorbs one of its levels into the intercept, meaning that only 29 out of 30 parks actually have their own park factor, and they are all calculated relative to the park selected to be the baseline. A random effect, however, by definition operates at a different level and has its own distribution, typically centered at zero. Thus, making park factors a random effect allows you to assign all 30 parks a plus/minus run value above or below a natural zero midpoint.
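To make the baseline issue concrete, here is a toy sketch with three made-up parks and invented runs-per-game samples. The first function mimics what default (treatment) coding of a fixed effect reports in a one-way model: an intercept equal to the baseline park's mean, plus offsets for the remaining parks. The second shows only the zero-centered presentation, in which every park gets its own value. A real random effect would also shrink those values toward zero, which this sketch deliberately leaves out.

```python
from statistics import mean

# Hypothetical runs-per-game samples for three made-up parks (toy data).
runs_by_park = {
    "Park A": [4.0, 5.0, 4.5],
    "Park B": [5.5, 6.0, 5.0],
    "Park C": [3.5, 4.0, 3.0],
}


def treatment_coded(groups, baseline):
    """Baseline mean becomes the intercept; other parks are offsets from it."""
    intercept = mean(groups[baseline])
    offsets = {
        park: mean(values) - intercept
        for park, values in groups.items()
        if park != baseline
    }
    return intercept, offsets


def zero_centered(groups):
    """Every park gets a deviation from the average of the park means."""
    park_means = {park: mean(values) for park, values in groups.items()}
    center = mean(list(park_means.values()))
    deviations = {park: m - center for park, m in park_means.items()}
    return center, deviations


intercept, offsets = treatment_coded(runs_by_park, baseline="Park A")
center, deviations = zero_centered(runs_by_park)

print("treatment coding:", intercept, offsets)   # Park A has no entry of its own
print("zero-centered:   ", center, deviations)   # all three parks appear
```

With treatment coding, changing the baseline park changes every reported number, while the zero-centered values stay put. That is the convenience the paragraph above describes.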

This season, we have treated park factors as fixed effects, and have been satisfied with the results. But as we continue to tweak and refine DRA, I wouldn’t be surprised if the treatment of park factors were reconsidered at some point in the future, for convenience purposes if nothing else.

Jonathan Judge is an author of Baseball Prospectus. Follow @bachlaw