February 10, 2010

Introducing SIERA
Part 3

Earlier this week we introduced the run estimator SIERA, providing a general summary of its purpose as well as the evolution of its development. Today, in Part 3, our focus shifts to the quantitative side of the metric, with a detailed look at the data used to derive the formula and the specifics of the regression analysis. This transparency should provide a better understanding of the integrity of the process, as well as a few insights into the SIERA approach to pitcher valuation.

The Data

All data used throughout this process, whether for calculating SIERA or the various comparative estimators, came from Retrosheet, a monumental achievement in the world of data without which several advancements in the field would not exist. The first step involved extracting seasonal tallies from the main events table, with statistics grouped by pitcher, team, and year. This way, a pitcher with stints on various clubs in the same season carries a separate entry for each; Cliff Lee, for example, has one entry as a Phillies pitcher and another as an Indians pitcher for last season.

Next, using the Lahman database, the pitching park factor (PPF) was added to each row in the table. Park-adjusted ERA was then calculated, though only half of the park factor was applied to individual pitchers, given that only half of a team’s games are played at the home stadium. If a pitcher ended up with a PPF of 105, then instead of taking 95 percent of his ERA, 97.5 percent was taken, equating to one-half of the difference between the actual park he called home and one considered neutral.

With the adjustment applied to raw ERA, the next issue involved batted ball reliability. While Retrosheet provides a fantastic wealth of information, batted ball data is realistically only usable from 2003 to the present. The major reason is that the way balls put in play were scored has not been consistent.
Before 2003, batted ball types were recorded only on outs, meaning that a ground ball single through the third base hole was recorded simply as a single, while a ground out to the second baseman went down as a grounder. Both are ground balls, but this inconsistency precludes the use of batted ball data prior to that season. Given this restriction, only data from 2003-09 moved on to the next round.

With that table in place, the QERA formula was unfoiled and the nine emerging terms were calculated for each row in the database table. The data was then ready for further processing and rigorous study.

The Results

SIERA was first estimated with 10 parameters: an intercept and the nine aforementioned terms that surface once QERA is unfoiled. This involved regressing park-adjusted ERA on all nine terms. The results can be seen below:

| VARIABLE | COEF. | T-STAT | P-STAT |
|---|---|---|---|
| Constant | 6.368 | 16.97 | 0.000 |
| SO/PA | -18.341 | -7.10 | 0.000 |
| BB/PA | 9.471 | 2.00 | 0.046 |
| (GB-FB-PU)/PA | -1.807 | -1.60 | 0.110 |
| (SO/PA)^2 | 10.254 | 1.98 | 0.048 |
| (BB/PA)^2 | 6.833 | 0.33 | 0.742 |
| ((GB-FB-PU)/PA)^2 | -7.063 | -3.93 | 0.000 |
| ((GB-FB-PU)/PA)*(SO/PA) | 9.661 | 2.38 | 0.017 |
| ((GB-FB-PU)/PA)*(BB/PA) | -3.208 | -0.44 | 0.661 |
| (BB/PA)*(SO/PA) | 2.828 | 0.18 | 0.857 |

Before getting into what the results say, a description of the columns is in order. The first column lists the variable in question, and the second lists the coefficient estimated by the regression. The t-statistic describes how many standard errors from zero the coefficient strayed, and the p-statistic tells us the probability that a coefficient that far from zero would surface if the true effect of the variable on park-adjusted ERA were actually zero. It is commonly accepted that coefficients with p-stats less than .05 or .10 are probably different from zero. Unfortunately, reliable data for balls in play exists only for 2003-09, which means that many coefficients that intuitively should be significant fail to reach significance.
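The relationship between the T-STAT and P-STAT columns can be checked with a short calculation. The sketch below is not the authors' code: it assumes the sample is large enough that the t-distribution is well approximated by a standard normal (the article does not give the sample size or degrees of freedom), and the function name `two_sided_p` is made up for illustration. Only the rows with non-trivial p-stats are copied from the first table.

```python
import math


def two_sided_p(t_stat: float) -> float:
    """Two-sided p-value for a t-statistic, using a normal approximation.

    Assumption: large sample, so the t-distribution is close to normal.
    """
    return math.erfc(abs(t_stat) / math.sqrt(2.0))


# (variable, t-stat, reported p-stat) copied from the first regression table
rows = [
    ("BB/PA", 2.00, 0.046),
    ("(GB-FB-PU)/PA", -1.60, 0.110),
    ("(SO/PA)^2", 1.98, 0.048),
    ("(BB/PA)^2", 0.33, 0.742),
    ("((GB-FB-PU)/PA)*(SO/PA)", 2.38, 0.017),
    ("((GB-FB-PU)/PA)*(BB/PA)", -0.44, 0.661),
    ("(BB/PA)*(SO/PA)", 0.18, 0.857),
]

for name, t, p_reported in rows:
    print(f"{name:26s} t={t:6.2f}  approx p={two_sided_p(t):.3f}  reported p={p_reported:.3f}")
```

Under that assumption the approximate values land within about .002 of the reported p-stats, with the small gaps attributable to the t-stats being rounded to two decimals.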
Our intuition helped to build this model, with the understanding that as pitchers get back on the mound and throw more games, even more accurate results can be had.

Note that the above table does not show the final formula for SIERA, but rather the original estimation: park-adjusted ERA regressed on the full set of unfoiled QERA terms. Also note that our original estimation used data from 2003-08 only; 2009 was excluded so that the regression could eventually be tested on outside data. The table above, however, includes 2009 data as well, so that it can be compared directly with the table below.

What immediately stands out is that the quadratic term for walks is not significant: the .74 p-stat indicates that there is a 74 percent chance that you would get a value further from zero than 6.833 if the true quadratic effect of walks on ERA were zero. The conclusion: the effect of walks on ERA is linear, though perhaps with interactions with strikeouts or ground balls. It is also evident that the interaction between strikeouts and walks is not significant either. This seems plausible, seeing as there is no reason to assume walks increase ERA more for high-strikeout pitchers than for those with low whiff totals.

Two quadratic terms are significant, as is one interaction term. The interaction between walks and ground balls could have been dropped, but intuition kept it afloat: the significance of the interaction between strikeouts and ground balls argues for keeping the walk interaction as well. The reason the strikeout interaction is believed to be meaningful in practice is that pitchers who strike out more batters allow fewer singles and need fewer double plays. The same logic applies to walks. Removing the other two insignificant terms moves the walk and ground ball interaction term closer to significance, though it remains far from it.
It is our belief that including this interaction gives a more accurate prediction of a pitcher’s skill level, and that the coefficient is insignificant only because the sample size is too small. Some of the other effects are even crisper when the regression is run with the two insignificant terms removed:

| VARIABLE | COEF. | T-STAT | P-STAT |
|---|---|---|---|
| Constant | 6.262 | 28.07 | 0.000 |
| SO/PA | -18.055 | -8.39 | 0.000 |
| BB/PA | 11.292 | 12.81 | 0.000 |
| (GB-FB-PU)/PA | -1.721 | -1.57 | 0.116 |
| (SO/PA)^2 | 10.169 | 1.97 | 0.049 |
| ((GB-FB-PU)/PA)^2 | -7.069 | -3.94 | 0.000 |
| ((GB-FB-PU)/PA)*(SO/PA) | 9.561 | 2.38 | 0.017 |
| ((GB-FB-PU)/PA)*(BB/PA) | -4.027 | -0.58 | 0.563 |

Four terms are worthy of further explanation because they are significant or close enough to it, as in the case of the linear term in (GB-FB-PU)/PA, whose square proved to be significant. Each will be explained separately:
Thus, these four points have shown us that strikeouts have a diminishing return as you accrue more of them, ground balls have an increasing return the higher your tally, and ground balls are more beneficial to pitchers who allow more walks or balls in play, especially because fly balls are more detrimental to pitchers who allow more runners on base.

How beneficial are these results? In Part 4 of our introductory series on SIERA, the estimator will be put to the test at both predicting same-year ERA better than other estimators that use similar statistics and predicting future-year ERA better than any other estimator out there.
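The diminishing and increasing returns summarized above can be read directly off the reduced regression. The following is a minimal sketch that plugs the coefficients from the eight-term table into the fitted polynomial and its partial derivatives. The function names and the sample input rates are hypothetical, and the article does not state that these coefficients are the final SIERA formula, so the output should be read as a fitted park-adjusted ERA from this particular regression only.

```python
# Coefficients copied from the reduced (eight-term) regression table.
COEF = {
    "const": 6.262,
    "so": -18.055,    # SO/PA
    "bb": 11.292,     # BB/PA
    "gb": -1.721,     # (GB-FB-PU)/PA
    "so2": 10.169,    # (SO/PA)^2
    "gb2": -7.069,    # ((GB-FB-PU)/PA)^2
    "gb_so": 9.561,   # ((GB-FB-PU)/PA)*(SO/PA)
    "gb_bb": -4.027,  # ((GB-FB-PU)/PA)*(BB/PA)
}


def fitted_era(so: float, bb: float, gb: float) -> float:
    """Fitted park-adjusted ERA; so, bb, gb are per-PA rates (gb = (GB-FB-PU)/PA)."""
    c = COEF
    return (c["const"] + c["so"] * so + c["bb"] * bb + c["gb"] * gb
            + c["so2"] * so ** 2 + c["gb2"] * gb ** 2
            + c["gb_so"] * gb * so + c["gb_bb"] * gb * bb)


def strikeout_slope(so: float, gb: float) -> float:
    """Partial derivative of fitted ERA with respect to SO/PA."""
    return COEF["so"] + 2 * COEF["so2"] * so + COEF["gb_so"] * gb


def ground_ball_slope(so: float, bb: float, gb: float) -> float:
    """Partial derivative of fitted ERA with respect to (GB-FB-PU)/PA."""
    return (COEF["gb"] + 2 * COEF["gb2"] * gb
            + COEF["gb_so"] * so + COEF["gb_bb"] * bb)


# Illustrative rates only, not a real pitcher's line.
print(round(fitted_era(so=0.20, bb=0.08, gb=0.05), 2))
# Strikeouts: the slope is negative but shrinks toward zero as SO/PA rises.
print(strikeout_slope(0.15, 0.0), strikeout_slope(0.25, 0.0))
# Ground balls: the slope gets more negative with a higher GB rate or more walks.
print(ground_ball_slope(0.20, 0.08, 0.00), ground_ball_slope(0.20, 0.12, 0.10))
```

The positive squared strikeout term is what makes each additional strikeout worth a little less, while the negative squared ground ball term and the negative walk interaction make each additional ground ball worth a little more.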
Matt Swartz is an author of Baseball Prospectus.
Cool stuff. Just a thought I had: have you tried using (BB-IBB)/PA as your walk rate variable (and maybe adding another variable, IBB/PA)? Presumably all intentional walks are issued to reduce the number of runs allowed, while in your regression all walks increase ERA equally. Then again, I assume the scarcity of intentional walks will make this addition insignificant.
Thanks. We did play around with IBB a little bit, but some of the problem is that it is difficult to differentiate between IBB where the pitcher gives up after getting into a 2-0 or 3-1 count and direct IBB from the first pitch, and then to separate even further the difference between those IBB and just pitching around people.
There certainly was some indication that IBB led to fewer runs, particularly with respect to the ground ball term, but at this sample size we figured it was probably best not to do something that could be construed as data mining. We also felt that the gains from distinguishing between BB & IBB seemed negligible anyway. That is a good point, though. Thanks for highlighting it.
The BB and IBB discussion is one Matt and I had for a long, long time, but we ultimately felt that the difficulty in differentiating the types of IBBs muddied the waters and for now just felt more comfortable using the term in its current state. But it is definitely something we were conscious of throughout this process.