
August 28, 2013

Reworking WARP

The Overlooked Uncertainty of Offense

by Colin Wyers

Previous Installments of Reworking WARP
The Series Ahead [8/21]


When I started working on a series about revising WARP, I didn’t expect to have much to say on the subject of offense. Measuring offense is probably the least controversial part of modern sabermetrics. So why start here? I have a few reasons:

  • It’s a good place to start, foundationally. The topic of run estimation covers a lot of tools that are useful in more up-for-debate areas.
  • The goal of this series is to be inquisitive; we shouldn’t just assume anything is right. We ought to test.
  • We tend to take the relatively low amount of measurement error on offense for granted, and so neglect the measurement error we do have.

So, we’ll math. But before we math, let’s talk a bit about how sabermetricians measure offense, as opposed to what I like to call “RBI logic.” Traditional accounting of baseball offense works on two basic principles:

  • If you get on base and eventually score, you are credited with a run scored.
  • If you drive in a runner (including yourself), you are credited with a run batted in.

Ignoring some pretty silly edge cases, this reconciles with team runs scored. The problem is that it’s such a binary model—either a runner scores or he doesn’t. With baseball, though, there are outcomes that can increase the probability of a runner scoring without driving him in immediately:

  • You can advance the runner, which makes him more likely to be driven in in a subsequent at-bat, and
  • You can avoid making an out, which (even if you do not advance the runner in doing so) gives additional batters behind you chances to drive him in.

So RBI logic does a very good job of reconciling to team runs, by sheer force of will, but it’s a poor reflection of the underlying run-scoring process. You end up crediting players for coming up in spots where runners are in scoring position, and ignoring the contributions of players who advance runners over. You also ignore the value of not making outs.

The foundation of most modern sabermetric analysis of run scoring is the run expectancy table. Here’s a sample table, derived from 2012 data:

RUNNERS   0 OUTS   1 OUT    2 OUTS
000       0.489    0.263    0.101
100       0.858    0.512    0.221
020       1.073    0.655    0.319
003       1.308    0.898    0.363
120       1.442    0.904    0.439
103       1.677    1.146    0.484
023       1.893    1.290    0.581
123       2.262    1.538    0.702

Reading top to bottom, each row is a base state: a zero indicates an empty base, and the digits one through three indicate a runner on that base. Reading left to right, each column is the number of outs in the inning. (It’s not explicitly listed on most run expectancy tables, but the three-out state is a special state in which runs expected goes to zero.) The table lists the average number of runs expected to score in the rest of the inning from that state, from a low of 0.101 runs with the bases empty and two outs all the way up to 2.262 runs with the bases loaded and no outs.

What’s interesting isn’t so much the run expectancy itself, but the change in run expectancy between events. So let’s run through an example. Say you have runners on first and third, no outs. That’s a run expectancy of 1.677. Now, suppose the next hitter walks. That moves you to a bases loaded, no outs situation. That walk would be worth 0.585 runs—a pretty important walk. What if the hitter strikes out instead? That moves you into a first and third with one out situation, for a value of -0.531.
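The arithmetic in that example can be sketched in a few lines of Python. This is only an illustration, not the code behind the article: the names `RUN_EXPECTANCY`, `run_expectancy`, and `run_value` are made up here, and the `runs_scored` term is the usual bookkeeping for plays on which a run actually crosses the plate (neither example above needs it).

```python
# Run expectancy by base state (key) and outs (tuple index 0, 1, 2),
# copied from the 2012 table in the article.
RUN_EXPECTANCY = {
    "000": (0.489, 0.263, 0.101),
    "100": (0.858, 0.512, 0.221),
    "020": (1.073, 0.655, 0.319),
    "003": (1.308, 0.898, 0.363),
    "120": (1.442, 0.904, 0.439),
    "103": (1.677, 1.146, 0.484),
    "023": (1.893, 1.290, 0.581),
    "123": (2.262, 1.538, 0.702),
}


def run_expectancy(runners: str, outs: int) -> float:
    """Expected runs for the rest of the inning from a base-out state."""
    if outs >= 3:
        return 0.0  # the three-out state: the inning is over
    return RUN_EXPECTANCY[runners][outs]


def run_value(before, after, runs_scored=0):
    """Change in run expectancy for one play, plus any runs scored on it."""
    (runners_before, outs_before) = before
    (runners_after, outs_after) = after
    return (
        run_expectancy(runners_after, outs_after)
        - run_expectancy(runners_before, outs_before)
        + runs_scored
    )


# First and third, no outs: a walk loads the bases, a strikeout adds an out.
walk = run_value(("103", 0), ("123", 0))
strikeout = run_value(("103", 0), ("103", 1))
print(f"walk: {walk:+.3f}")            # walk: +0.585
print(f"strikeout: {strikeout:+.3f}")  # strikeout: -0.531
```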

We come up with the value of each event by looking at the average run expectancy change for each event—that’s known as the event’s linear weights value. Here’s a set of linear weights values for official events in 2012:

Event   LWTS
HR       1.398
3B       1.008
2B       0.723
1B       0.443
HBP      0.314
IBB      0.174
NIBB     0.296
K       -0.261
Out     -0.246

We’ve separated the intentional walk from other walks. You’ll note that a hit-by-pitch is worth more runs than a walk: compared to hit batters, pitchers tend to issue fewer walks with first base occupied, which is when a free pass does the most damage. Shockingly, a home run is worth more than a triple, a triple is worth more than a double, and so on.
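As a rough sketch of the averaging step behind a linear weights table, here is how the run values of individual plays roll up into one number per event. The play log is hypothetical (six made-up plays, nowhere near a full season), though the run expectancy figures in it come from the 2012 table; `PLAYS` and `linear_weights` are illustrative names, not anything from the article.

```python
from collections import defaultdict

# Hypothetical play log: (event, RE before, RE after, runs scored on the play).
# The RE numbers are from the 2012 table; the plays themselves are made up.
PLAYS = [
    ("NIBB", 1.677, 2.262, 0),  # first and third, 0 out -> bases loaded
    ("NIBB", 0.489, 0.858, 0),  # bases empty, 0 out -> runner on first
    ("K",    1.677, 1.146, 0),  # first and third, 0 out -> same runners, 1 out
    ("K",    0.263, 0.101, 0),  # bases empty, 1 out -> 2 out
    ("HR",   0.489, 0.489, 1),  # solo homer, bases empty, 0 out
    ("HR",   0.512, 0.263, 2),  # two-run homer, runner on first, 1 out
]


def linear_weights(plays):
    """Average run value (RE change plus runs scored) for each event type."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for event, re_before, re_after, runs_scored in plays:
        totals[event] += re_after - re_before + runs_scored
        counts[event] += 1
    return {event: totals[event] / counts[event] for event in totals}


weights = linear_weights(PLAYS)
for event, value in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{event:5s} {value:+.3f}")
```

With only two plays per event the averages land far from the published values; the point is the mechanics, not the numbers.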

Now let’s look at the same table, but with one new piece of information—the standard deviation around that average change in run expectancy:

Event   LWTS     STDERR
HR       1.398   0.533
3B       1.008   0.520
2B       0.723   0.456
1B       0.443   0.327
Out     -0.246   0.187
HBP      0.314   0.183
NIBB     0.296   0.170
K       -0.261   0.147
IBB      0.174   0.071

There is a substantial correlation between the average run value of an event and its standard error, which shouldn’t be surprising. It also tells us that the actual value of a player’s offense is more uncertain the more he relies upon power; the value of a home run is more uncertain than that of a single, after all.

We need to get into a bit of gritty math stuff here before getting to the fun stuff. What you have to remember is that the standard deviation is simply the square root of the variance around the average. In order to combine standard deviations, you have to first square them, then combine them, then take the square root again. (In other words, variances add, not standard deviations.)
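To make "variances add" concrete, here is a minimal sketch that rolls the per-event figures up into a single error for a season. It assumes every event is independent and contributes its own variance, which the article doesn't spell out, so treat it as one plausible reading rather than the method actually used. The season line is hypothetical, and `EVENT_SD` and `season_error` are names invented for the example.

```python
import math

# Per-event spread in runs, from the STDERR column of the table above.
EVENT_SD = {
    "HR": 0.533, "3B": 0.520, "2B": 0.456, "1B": 0.327, "Out": 0.187,
    "HBP": 0.183, "NIBB": 0.170, "K": 0.147, "IBB": 0.071,
}


def season_error(event_counts):
    """Square each event's spread, add once per occurrence, then take the root.

    Assumes independent events (an assumption of this sketch).
    """
    variance = sum(n * EVENT_SD[event] ** 2 for event, n in event_counts.items())
    return math.sqrt(variance)


# Hypothetical full-season batting line, not any real player's.
SEASON = {
    "HR": 25, "3B": 3, "2B": 30, "1B": 100, "HBP": 5,
    "NIBB": 55, "IBB": 5, "K": 110, "Out": 300,
}

error = season_error(SEASON)
print(f"estimated error: {error:.1f} runs")
```

Note that adding the standard deviations directly would badly overstate the error; it is the squared values that accumulate.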

Now, here’s a list of the top 10 players in batting runs above average (BRAA, derived from linear weights) in 2012, along with the estimated error for each:

NAME                BRAA   STDERR
Mike Trout          61.7   6.6
Buster Posey        49.7   6.5
Miguel Cabrera      49.2   7.1
Andrew McCutchen    48.7   6.7
Prince Fielder      44.3   6.7
Edwin Encarnacion   44.0   6.5
Robinson Cano       43.6   7.0
Ryan Braun          43.5   6.8
Joey Votto          43.0   5.6
Adrian Beltre       42.7   6.8

So the difference between Mike Trout and Miguel Cabrera in 2012 was 12.5 runs. The combined standard error for the two of them (remember, variances add) is 9.7. How confident are we that Trout was a better hitter (relative to average) than Cabrera in 2012? Divide the difference by the combined standard error and you get 1.3, which is what’s known as a z-score. Look up a z-score of 1.3 in a z-chart, and you get .9032, or roughly 90 percent. So there’s a 90 percent chance, given our estimates of runs and our estimates of error, that Trout was the better hitter. Now, we should emphasize that a 90 percent chance that he was means there’s a 10 percent chance that he wasn’t.

What if we compare Posey to Beltre? That’s a difference of seven runs, which works out to a confidence level of 77 percent that Posey was the better hitter. What about comparing Braun to Votto? That’s a difference of just half a run between them, so our confidence is only about 52 percent, essentially a coin flip.
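Those comparisons can be reproduced with a short sketch. In place of a printed z-chart it uses the normal CDF written in terms of `math.erf`, and it works from the unrounded z-score, so the Trout and Cabrera figure comes out a hair under the .9032 you get from looking up exactly 1.3. The function names are illustrative, and the inputs are the BRAA and STDERR values from the table.

```python
import math


def pooled_error(*errors):
    """Combine standard errors: square them, add them, take the square root."""
    return math.sqrt(sum(e * e for e in errors))


def normal_cdf(z):
    """Standard normal cumulative probability (what a z-chart tabulates)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def confidence_better(runs_a, err_a, runs_b, err_b):
    """Chance that player A out-hit player B, given both estimates and errors."""
    z = (runs_a - runs_b) / pooled_error(err_a, err_b)
    return normal_cdf(z)


trout_cabrera = confidence_better(61.7, 6.6, 49.2, 7.1)
posey_beltre = confidence_better(49.7, 6.5, 42.7, 6.8)
braun_votto = confidence_better(43.5, 6.8, 43.0, 5.6)

print(f"Trout over Cabrera: {trout_cabrera:.0%}")  # Trout over Cabrera: 90%
print(f"Posey over Beltre:  {posey_beltre:.0%}")   # Posey over Beltre:  77%
print(f"Braun over Votto:   {braun_votto:.0%}")    # Braun over Votto:   52%
```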

So what we have is a way to measure the uncertainty in our measurement of run production, and then to apply a confidence interval to our estimates. For a full-time player (one qualified for the batting title, that is) the average standard error is roughly six runs. If you want to compare bad hitters to good hitters, sure, most of the time the difference between them far outstrips the measurement error. But if you want to compare good hitters to good hitters (which is frankly a lot more interesting, and probably a lot more common), then you’ll often find yourself running into cases where the difference between them is close to, if not lower than, the uncertainty of your measurements.

So if we can quantify our measurement uncertainty, the next question we can ask is, is there a way to measure offense that’s subject to less measurement uncertainty? I have a handful of ideas on the subject, which we’ll take a look at next week.

Colin Wyers is an author of Baseball Prospectus. 