May 9, 2013
Baseball Therapy
Should I Worry About My Favorite Pitcher?
by Russell A. Carleton
Of course. He's a pitcher.
I've gotten a few requests for this one. Almost a year ago, upon returning to Baseball Prospectus, I posted an update to the work that I had previously done on the reliability of hitting statistics. I had originally written one on pitching stats, as well, but never updated it similarly.
Warning! Gory Mathematical Details Ahead!
As with my piece from a year ago here at BP, I'm updating the methodology that I had originally used. I'm using Kuder-Richardson (formula 21) reliability for binary outcomes and Cronbach's alpha for non-binary outcomes. Data set is 2003-2012 (Thanks, Retrosheet!), with a minimum of 2000 batters faced during those years for each pitcher (unless otherwise noted), meaning that I can see reliability up to sample frames of 1000 PA. For stats that refused to stabilize by a sample size of 1000 PA, I used the Spearman-Brown prophecy formula to estimate the stabilization point.
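For readers who want to see the mechanics, here is a minimal Python sketch of the two reliability formulas named above. It is not the code behind this article: the function names, the data layout (one row per pitcher, one column per plate appearance or at-bat), the use of population variance, and the toy numbers are all assumptions made for illustration.

```python
from statistics import mean, pvariance


def kr21(successes, k):
    """KR-21 reliability for a binary outcome.

    successes: one count per pitcher, e.g. strikeouts in that pitcher's
               sample of k batters faced.
    k:         number of batters faced in the sample frame.
    """
    m = mean(successes)
    var = pvariance(successes)  # assumption: population variance of the counts
    if var == 0:
        raise ValueError("every pitcher has the same count; reliability is undefined")
    return (k / (k - 1)) * (1 - m * (k - m) / (k * var))


def cronbach_alpha(rows):
    """Cronbach's alpha for a non-binary outcome.

    rows: one list per pitcher; each list holds k per-AB values,
          e.g. total bases allowed on each at-bat (for SLG).
    """
    k = len(rows[0])
    item_vars = [pvariance(col) for col in zip(*rows)]  # variance within each AB slot
    total_var = pvariance([sum(r) for r in rows])       # variance of pitcher totals
    if total_var == 0:
        raise ValueError("every pitcher has the same total; alpha is undefined")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data, not real pitchers.
toy_strikeouts = [0, 5, 10]                  # strikeouts in 10 BF for three pitchers
toy_total_bases = [[0, 0], [1, 1], [2, 2]]   # total bases on 2 AB for three pitchers

print(round(kr21(toy_strikeouts, 10), 3))
print(round(cronbach_alpha(toy_total_bases), 3))
```

KR-21 needs only each pitcher's count and the sample size, which is why it suits yes/no outcomes like strikeouts; alpha needs the value of every individual event, which is what a stat like SLG requires.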
These numbers represent the point at which each stat reaches a reliability of .70 or greater according to the relevant formula.
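The Spearman-Brown step can be sketched the same way. Given a reliability observed at one sample size, the prophecy formula projects reliability at a longer or shorter sample, and it can be rearranged to solve for the sample size that reaches .70. The function names and the example inputs below are hypothetical; only the formula itself is standard.

```python
def spearman_brown(r, factor):
    """Projected reliability when the sample is lengthened by `factor`."""
    return factor * r / (1 + (factor - 1) * r)


def sample_needed(r_observed, n_observed, target=0.70):
    """Sample size at which the projected reliability reaches `target`.

    Rearranges the Spearman-Brown formula to solve for the lengthening factor.
    """
    if not 0 < r_observed < 1:
        raise ValueError("observed reliability must be strictly between 0 and 1")
    factor = target * (1 - r_observed) / (r_observed * (1 - target))
    return n_observed * factor


# Hypothetical input: a stat whose reliability is only .50 at 1000 PA.
n = sample_needed(0.50, 1000)
print(round(n))
# Sanity check: projecting forward to that sample size should give back .70.
print(round(spearman_brown(0.50, n / 1000), 2))
```

With those made-up inputs, a stat sitting at .50 after 1000 PA would be projected to cross .70 at roughly 2,333 PA. The projection assumes the additional plate appearances behave like the ones already observed.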
Statistic | Definition | Stabilized at | Notes
Strikeout rate | K / PA | 70 BF |
Walk rate | BB / PA | 170 BF | IBB's not included
HBP rate | HBP / PA | 640 BF |
Single rate | 1B / PA | 670 BF |
XBH rate | (2B + 3B) / PA | 1450 BF | Estimate*
HR rate | HR / PA | 1320 BF | Estimate*
AVG | H / AB | 630 BF | Min 2000 AB's
OBP | (H + HBP + BB) / PA | 540 BF |
SLG | (1B + 2 * 2B + 3 * 3B + 4 * HR) / AB | 550 AB | Min 2000 AB's, Cronbach's alpha used
ISO | (2B + 2 * 3B + 3 * HR) / AB | 630 AB | Min 2000 AB's, Cronbach's alpha used
GB rate | GB / balls in play | 70 BIP | Min 1000 BIP, Retrosheet classifications used
FB rate | (FB + PU) / balls in play | 70 BIP | Min 1000 BIP including HR
LD rate | LD / balls in play | 650 BIP | Min 1000 BIP including HR, Estimate*
HR per FB | HR / FB | 400 FB | Min 500 FB, Estimate*
BABIP | Hits / BIP | 2000 BIP | Min 1000 BIP, HR not included, Estimate*
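As a quick illustration of the definitions in the table, the sketch below computes several of the rates from a single line of counting stats. The class, its field names, and the sample line are hypothetical; the formulas follow the table's Definition column, including its simplified OBP.

```python
from dataclasses import dataclass


@dataclass
class PitcherLine:
    """Counting stats allowed by a pitcher (field names are illustrative)."""
    pa: int
    ab: int
    singles: int
    doubles: int
    triples: int
    hr: int
    bb: int   # unintentional walks only, per the table's note
    hbp: int
    k: int

    @property
    def hits(self):
        return self.singles + self.doubles + self.triples + self.hr

    def k_rate(self):
        return self.k / self.pa

    def bb_rate(self):
        return self.bb / self.pa

    def avg(self):
        return self.hits / self.ab

    def obp(self):
        # The table's definition: (H + HBP + BB) / PA
        return (self.hits + self.hbp + self.bb) / self.pa

    def slg(self):
        total_bases = self.singles + 2 * self.doubles + 3 * self.triples + 4 * self.hr
        return total_bases / self.ab

    def iso(self):
        return (self.doubles + 2 * self.triples + 3 * self.hr) / self.ab


# Made-up season line, not a real pitcher.
line = PitcherLine(pa=700, ab=630, singles=110, doubles=30, triples=3,
                   hr=17, bb=50, hbp=5, k=150)
for name in ("k_rate", "bb_rate", "avg", "obp", "slg", "iso"):
    print(name, round(getattr(line, name)(), 3))
```

Note that ISO is just SLG minus AVG under these definitions, which is a handy check on the arithmetic.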
What Do I Really Know About You?
Well, we quickly have an idea of strikeout rate, ground ball and fly ball tendencies, and (somewhat less quickly) walk rate. Over a season, you can get a pretty good idea of a pitcher's single and HBP rates. Strangely enough, singles stabilize a lot faster than the alleged "true" outcome of HR rate. Some of the classic one-number rates (OBP, SLG) can stabilize over the course of a year for a full-time starter. And yes, BABIP still needs a lot of data (roughly 2000 balls in play), but that number is actually about half of what I had once estimated elsewhere.
All Numbers Tell a Story, But it Might Not Be The Story You Wanted to Hear
I'm well aware of the fact that most of the requests for these analyses came from people who were trying to get a feel for whether their favorite pitcher who is off to a bad start (read: Roy Halladay) is just having a bad couple of games or whether his performance is "real." I want to (again) point out that the way in which I most often see these numbers used is not exactly what they're meant to show.
When I say that strikeout rate for pitchers stabilizes at 70 batters faced, what I mean is that we can be reasonably sure that his strikeout rate over those 70 batters is a good reflection of his talent level over those 70 (now past) plate appearances. This is different from saying that once a pitcher has gotten to 70 batters, we can assume that he will perform this way for the rest of the season. That's an assumption. It's not a bad one, but it is an assumption. Instead, what it means is that if his underlying skill set has changed in some meaningful way, we'll know in 70 plate appearances.
Also, I'd caution people against treating these numbers too dogmatically. 70 plate appearances is not a magic number. It's the point where a measure of reliability slowly crosses an only-somewhat-arbitrary line in the sand. At 69 PA, the reliability for strikeout rate is just shy of .70, and you need to have just a shade less confidence in any proclamations that you make using those 69 PA.
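To make the "not a magic number" point concrete, here is a small sketch of how reliability would slide up and down around the threshold. It assumes, purely for illustration, that strikeout rate sits at exactly .70 at 70 batters faced and that the Spearman-Brown formula describes how reliability changes with sample size; the function name is made up.

```python
def projected_reliability(r_known, n_known, n_new):
    """Spearman-Brown projection of reliability from one sample size to another."""
    factor = n_new / n_known
    return factor * r_known / (1 + (factor - 1) * r_known)


# Assumption: reliability of exactly .70 at 70 BF (the table's threshold point).
for n in (35, 70, 140, 280):
    print(n, round(projected_reliability(0.70, 70, n), 3))
```

Under those assumptions, half the sample (35 BF) still gives a reliability around .54 and double the sample (140 BF) gives about .82. Confidence grows gradually with each batter faced rather than switching on at 70.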
Russell A. Carleton is an author of Baseball Prospectus.
Great article. You had me at "Instead, what it means is that if his underlying skill set has changed in some meaningful way, we'll know in 140 plate appearances."
I had a football coach who knew just enough about statistics to be dangerous. He discovered we scored 100% of the time when we attempted a td pass from between the 25 and 30 yard line. A running back made a break into wide open field on 2nd and 5 from the 40 and he made him run out of bounds at the 26 with the threat of death. Next play, the 100% productive td pass from between the 25 and 30 was picked off.
I looked at him like he was stupid, but a friend of mine made a good point. Even with the pick the play still saw a success rate of 66.7%.
Stats rule, in context.