
Baseball Prospectus home
  
  

July 21, 2010

Checking the Numbers

To Subtract or Divide

by Eric Seidman


In this day and age, baseball players are defined by their statistical attributes much more than they were a few decades ago. That isn’t to say that stats rule all, but rather that teams are being built with more of an eye toward numbers than in the past, or at least toward numbers that provide more information. We have witnessed the defensive revolution. This past offseason, the Red Sox made a conscious effort to bring aboard the darlings of fielding metrics (Mike Cameron, Marco Scutaro, and Adrian Beltre), while teams shied away from the likes of Jermaine Dye, who averaged 33 home runs and a .279/.347/.528 line over the last four seasons, because his overall contributions were not in line with his asking price. The offseason before that, the glut of hard-hitting but poor-fielding corner outfielders suffered financially; it’s hard to imagine players with skill sets similar to those of Adam Dunn and Bobby Abreu being offered so little even a few years earlier.

Simply put, with decisions hinging on some of these numbers, it is imperative that users of the information not only reach for the appropriate tool but also understand why certain metrics are used. You don’t want to bring a knife to a gunfight, but on a more granular level, it also isn’t smart to bring a butter knife to a cleaver battle, if such things exist. My favorite television show growing up was "The X-Files," so it should come as no surprise that my goal as an analyst has always been to spread the truth in whatever way possible. My goal today is to use a topic I recently wrote about, along with a couple of the ensuing comments, to revisit which numbers we might use in a specific situation and why.

Over the last two weeks I have written about Cliff Lee’s strikeout-to-walk ratio, breaking down the metric itself, comparing his current rate to the single-season highs of the last few decades, and comparing it to the rates of others through a similar point in the season. The articles found that nobody has ever had a K/BB ratio as high as his this deep into a season, and that it would take a relative implosion (given where he is, a ratio of around 4.50 would be considered implosive) for Lee not to break the single-season K/BB record for pitchers with at least 150 innings, the 11.00 that Bret Saberhagen posted in the strike-shortened 1994 campaign.

In the more recent piece, I took the 10 highest rates at a similar point in time and found their respective rates from that point forward. David Wells, at 14.50 on June 27, 2003, had the highest non-Lee rate, with Curt Schilling’s 13.64 on June 14, 2002, coming in second place. The contrast in how these two pitchers arrived at their rates illustrates the shortcomings of the K/BB ratio in general, as Wells’ 14.50 consisted of 58 strikeouts and four walks, while Schilling whiffed 150 and issued just 11 free passes. Wells avoided walks, but so did Schilling, and the latter punched out more than two and a half times as many hitters. In other words, Wells posted the higher rate, but the inputs to Schilling’s rate added more value.

Value is the key, as the goal of most evaluators and analysts is to advise or discuss players in terms of the value they can add to a team. Wells issued a minuscule number of walks, but Schilling prevented many, many more hitters from doing harm to his team by preventing them from even making contact. Value can be tough to measure as well because some of the more telling numbers are not stable, meaning that they fluctuate from year to year. A pitcher with a 3.65 ERA in 2009 isn’t a lock to come anywhere near that mark in 2010, and so predictivity—a word I’ve wanted to use for a while but over which I was afraid to bypass the red squigglies—plays a major role in assessing value.

A player whose solid season rests on numbers likely to repeat is more valuable than, say, the 2007 version of Kyle Kendrick, who posted a sub-4.00 ERA but had very poor peripherals. All of which brings us back to the Wells vs. Schilling conversation that surfaced in the comments of my articles. Several readers pointed out that Schilling produced a lower rate but added more value. It was also noted that the correlation of K/BB from one half of the season to the other across the 10 pitchers in the table was negligible, while the correlation of the K-BB differential was quite high.

The differential simply subtracts walks from strikeouts, so even though Wells had the higher rate, his +54 paled in comparison to Schilling’s +139. The correlations matter because if the goal is to measure a specific aspect of performance (limiting walks and whiffing batters, in this case) and one metric proves more telling than another, we want to use the more informative one. The K/BB ratio is more commonly used because it is familiar, but if K-BB, or (K-BB)/PA, is more predictive, then it is a better indicator of value because it offers more help in making decisions.
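To make the Wells and Schilling comparison concrete, here is a minimal sketch of the two calculations, using only the strikeout and walk totals quoted above. The function names are made up for illustration.

```python
def k_bb_ratio(strikeouts, walks):
    """Strikeout-to-walk ratio (K/BB): divide."""
    return strikeouts / walks


def k_bb_diff(strikeouts, walks):
    """Strikeout-minus-walk differential (K-BB): subtract."""
    return strikeouts - walks


# (strikeouts, walks) as quoted in the article
pitchers = {
    "David Wells, through June 27, 2003": (58, 4),
    "Curt Schilling, through June 14, 2002": (150, 11),
}

for name, (k, bb) in pitchers.items():
    ratio = k_bb_ratio(k, bb)
    diff = k_bb_diff(k, bb)
    print(f"{name}: K/BB = {ratio:.2f}, K-BB = {diff:+d}")
```

Wells comes out ahead on the ratio (14.50 to 13.64), while Schilling comes out far ahead on the differential (+139 to +54).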

Is it a better indicator of value? If so, we would expect it to correlate with a common value-laden metric more strongly than the K/BB ratio does. To that end, I pooled every pitcher season of 150 or more innings from 2000-09 and measured the correlation between (K-BB)/PA and ERA, repeating the test for K/BB and for K/UBB (strikeouts per unintentional walk) to see which bond proved strongest. Across those 960 pitcher seasons, the r for K/UBB to ERA is -0.48, for K/BB to ERA is -0.49, and for (K-BB)/PA to ERA is -0.58. For those wondering whether or not these numbers are significant, the answer is yes because, in the context of baseball, correlation coefficients above 0.45 in absolute value tend to matter more than they would in many other fields.

The negative sign indicates an inverse relationship: as one goes up, the other goes down. A higher differential is associated with a lower ERA and vice versa, and the strikeout and walk differential clearly wins out in this regard. Backtracking to Wells and Schilling, the comments indicated that the split-half correlations favored the differential over the ratio, which matters all the more because the differential also has the stronger relationship with a common metric used to assess value.
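For readers who want to see what a negative r looks like mechanically, here is a rough sketch of the Pearson correlation coefficient. The five data points are invented purely for illustration and are not drawn from the 960 pitcher seasons in the study; the function and variable names are likewise hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences of at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# INVENTED toy values, NOT data from the article's sample
toy_diff_per_pa = [0.05, 0.08, 0.11, 0.15, 0.20]
toy_era = [4.90, 4.60, 4.10, 3.70, 3.00]

r = pearson_r(toy_diff_per_pa, toy_era)
print(f"r = {r:.2f}")  # negative: higher differential rate, lower ERA
```

The toy numbers line up almost perfectly, so the r here is far stronger than the -0.58 reported for real seasons. Only the sign and the mechanics carry over.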

With that in mind, comparisons of Lee’s strikeout and walk prowess at this juncture should involve finding whether anyone else, at a similar point in a given year, had a differential as large or larger. Those pitchers would then be used to estimate expected values for Lee over the remainder of the season. Currently, Lee has thrown 121 2/3 innings and has a +90 differential: 97 punchouts and just seven walks. Using the same database techniques as last week (finding pitchers with similar numbers through a roughly equivalent point of the season), the tables below show the 10 highest K-BB differentials, as well as their differentials from that point forward:

Name            Year  IP1    K-BB1   DIFF1  (K-BB)/PA
Curt Schilling  2002  122.1  170-12  158    .332
Randy Johnson   2001  126.2  189-40  149    .293
Pedro Martinez  1999  124.2  170-22  148    .303
Pedro Martinez  2000  121.0  162-24  138    .294
Randy Johnson   1995  125.1  176-39  137    .267
Randy Johnson   2000  123.2  164-27  137    .292
Randy Johnson   1999  129.2  171-36  135    .259
Pedro Martinez  2002  129.0  158-28  130    .249
Curt Schilling  1998  123.2  157-30  127    .258
Curt Schilling  2001  129.1  141-18  123    .238


And how did they do from that point forward?
 

Name            Year  IP2    K-BB2   DIFF2  (K-BB)/PA
Curt Schilling  2002  137.0  146-21  125    .231
Randy Johnson   2001  123.0  183-31  152    .313
Pedro Martinez  1999  88.2   143-15  128    .369
Pedro Martinez  2000  96.0   122-8   114    .328
Randy Johnson   1995  89.0   118-27  91     .258
Randy Johnson   2000  125.0  183-49  134    .251
Randy Johnson   1999  142.0  193-34  159    .284
Pedro Martinez  2002  70.1   81-12   69     .261
Curt Schilling  1998  145.0  143-31  112    .188
Curt Schilling  2001  127.1  152-21  131    .259

OK, OK, so there isn’t much variety here, which just goes to show how rare it is for pitchers to exhibit this type of dominance in the comparison of strikeouts and walks. Additionally, the top 10 differentials and rates in the first table make Lee’s +90 and .189 look like puny, little girly-men. Though only three pitchers show up here, the aggregate rates do not fall that much over the second half, which is to be expected if we assume that there is a high correlation between first- and second-half differential. The numbers up to that point are better, for sure, but the decline from that point forward is nowhere near as severe as, say, David Wells falling from a 14.50 K/BB ratio to a 2.69 in his second half.
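One way to check the "do not fall that much" claim against the tables is to pool the 10 rows in each. The tables do not list plate appearances, so this sketch uses K-BB per inning as a stand-in for (K-BB)/PA; that substitution, and the helper names, are assumptions of the example rather than anything from the original analysis.

```python
def innings(ip):
    """Convert baseball innings notation ('122.1' means 122 1/3) to a float."""
    whole, _, thirds = ip.partition(".")
    return int(whole) + int(thirds or 0) / 3


# (IP, strikeouts, walks) per row, transcribed from the two tables above
first_segment = [
    ("122.1", 170, 12), ("126.2", 189, 40), ("124.2", 170, 22),
    ("121.0", 162, 24), ("125.1", 176, 39), ("123.2", 164, 27),
    ("129.2", 171, 36), ("129.0", 158, 28), ("123.2", 157, 30),
    ("129.1", 141, 18),
]
rest_of_season = [
    ("137.0", 146, 21), ("123.0", 183, 31), ("88.2", 143, 15),
    ("96.0", 122, 8), ("89.0", 118, 27), ("125.0", 183, 49),
    ("142.0", 193, 34), ("70.1", 81, 12), ("145.0", 143, 31),
    ("127.1", 152, 21),
]


def pooled(rows):
    """Return pooled innings, pooled K-BB, and pooled K-BB per inning."""
    ip = sum(innings(r[0]) for r in rows)
    diff = sum(r[1] - r[2] for r in rows)
    return ip, diff, diff / ip


for label, rows in (("first segment", first_segment),
                    ("rest of season", rest_of_season)):
    ip, diff, per_inning = pooled(rows)
    print(f"{label}: {ip:.1f} IP, K-BB = {diff:+d}, "
          f"K-BB per inning = {per_inning:.2f}")
```

Pooled this way, the group goes from about 1.10 to about 1.06 in K-BB per inning, a modest dip rather than a collapse.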

Anyway, the point here isn’t necessarily to showcase what might happen to Lee moving forward but rather to discuss, with the help of examples, why we use certain numbers and how we can choose the appropriate weapon from our arsenal. The goal, more often than not, is to determine value, and to do that it is more helpful to find numbers that are likely to persist over the course of a season and into the next and that share a strong bond with numbers generally associated with value. When discussing strikeout rates and walk rates, it is apparently more informative to use some form of the strikeout and walk differential, as it captures what the strikeout-to-walk ratio attempts to provide while also accounting for the ratio’s shortcomings.

Lee might still set the record for the single-season K/BB ratio with 150 or more innings, but over the last 40 years (1970-2009), the highest differential belongs to Randy Johnson’s 2001 season: +301, the sum of his +149 and +152 in the tables above. Color me skeptical that Lee reaches that threshold. This does not take anything away from his fabulous season, but it suggests that by obsessing over his ratio, we are actually asking the wrong questions. Rates and raw tallies each have their place, and hopefully this sheds light on when to use each and how to decide.
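A quick back-of-the-envelope sketch shows why skepticism is warranted. It takes Johnson's 2001 total from his two rows in the tables above and asks how many innings Lee would need at his current pace. The constant-pace assumption and the variable names are mine, not part of the original analysis.

```python
# Cliff Lee so far, as quoted in the article
lee_k, lee_bb = 97, 7
lee_ip = 121 + 2 / 3  # 121 2/3 innings

# Randy Johnson, 2001: sum of his two rows in the tables above
johnson_k = 189 + 183
johnson_bb = 40 + 31
johnson_diff = johnson_k - johnson_bb

lee_diff = lee_k - lee_bb
pace = lee_diff / lee_ip  # K-BB per inning to date

# ASSUMPTION: Lee sustains exactly his current per-inning pace all season
innings_needed = johnson_diff / pace

print(f"Lee so far: {lee_diff:+d} in {lee_ip:.1f} IP")
print(f"Johnson 2001: {johnson_diff:+d}")
print(f"Innings Lee would need at this pace: {innings_needed:.0f}")
```

Even holding his current pace, Lee would need a little over 400 innings to match Johnson's differential.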

Eric Seidman is an author of Baseball Prospectus. 