March 14, 2010

Great Expectations

A Context-Dependent Approach

by Dan Malkiel

In the A’s-Rangers game last Sept. 15, Oakland's Mark Ellis led off the fourth inning with a single to left. Daric Barton followed with a double to deep center, sending Ellis to third. Next up was Cliff Pennington, whose fly ball to left was too shallow to score Ellis. Adam Kennedy then struck out, and the A’s were on the verge of squandering a terrific run-scoring opportunity until Rajai Davis saved the day with a bloop single to right, driving in both runners.

Whose contribution to the rally was most valuable? This question is rather subjective and open to interpretation. Most fans would probably choose Davis because of his clutch two-RBI knock, but it’s possible to make cases for others as well. Ellis’ single got everything started, and we all know how crucial it is to get the leadoff man aboard. Barton’s long drive turned a promising start into a potentially huge inning; it wasn’t his fault that the two subsequent batters failed to cash it in.

Suppose we rephrase the question: Which player contributed the most expected runs to his team? In other words, who did the most to increase his team’s run-scoring potential? We now have an objective question that we can answer using an expected runs matrix, which indicates the average number of runs scored in the remainder of the inning for each of the 24 base/out states. Below is the matrix from 2009 along with the relative frequency of each state, which will be necessary later. The text at the start of the state name indicates the number of outs, while the three-digit sequence indicates which bases are occupied; thus, “two023” means “two outs, runners on second and third.”

State	Expected Runs	Frequency
zero000	0.52	24.2%
zero003	1.31	0.3%
zero020	1.14	1.8%
zero023	2.01	0.4%
zero100	0.88	5.5%
zero103	1.77	0.5%
zero120	1.48	1.4%
zero123	2.28	0.4%
one000	0.28	17.2%
one003	0.97	1.1%
one020	0.69	3.0%
one023	1.41	0.9%
one100	0.53	6.4%
one103	1.20	1.1%
one120	0.92	2.6%
one123	1.56	1.0%
two000	0.11	13.6%
two003	0.37	1.6%
two020	0.32	3.9%
two023	0.56	1.0%
two100	0.22	6.1%
two103	0.52	1.4%
two120	0.46	3.2%
two123	0.75	1.2%
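For readers who want to work with the table programmatically, here is a minimal sketch of how the state labels could be decoded. The function name `parse_state` and the `OUT_WORDS` lookup are hypothetical helpers introduced for illustration; only the label format itself comes from the table above.

```python
# Hypothetical helper: decode labels such as "two023" from the table above.
OUT_WORDS = {"zero": 0, "one": 1, "two": 2}

def parse_state(name):
    """Split a state label into (outs, occupied_bases)."""
    prefix, digits = name[:-3], name[-3:]   # e.g. "two" and "023"
    outs = OUT_WORDS[prefix]
    # A "0" digit marks an empty base; any other digit names the occupied base.
    bases = tuple(int(d) for d in digits if d != "0")
    return outs, bases

print(parse_state("two023"))   # (2, (2, 3))
print(parse_state("zero000"))  # (0, ())
```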

Ellis led off the inning, a situation (zero000) with an expected value of .52 runs; after his single, the situation became zero100, which has an expected value of .88 runs. We can therefore credit Ellis with .36 expected runs. A similar calculation for Barton, who turned zero100 into zero023, shows that he contributed 1.13 expected runs. For Davis, the arithmetic is a bit different: he plated two runners, for which he is credited with two expected runs, but because he transformed a two023 situation into a two100 situation, he must be charged with -.34 expected runs. By this calculus, Davis’ contribution of 1.66 expected runs to the rally was the most significant, in accordance with what I suspect would be most fans’ intuitive judgment.
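The arithmetic for the whole rally can be sketched in a few lines of Python. The run expectancies are copied from the 2009 table; the names `RE`, `RALLY`, and `runs_added` are illustrative, and the states assigned to Pennington and Kennedy are inferred from the play description rather than stated outright in the text.

```python
# 2009 run expectancies for the states that occur in the rally (from the table).
RE = {
    "zero000": 0.52, "zero100": 0.88, "zero023": 2.01,
    "one023": 1.41, "two023": 0.56, "two100": 0.22,
}

# (batter, state before, state after, runs scored on the play)
# Pennington's and Kennedy's states are inferred from the narrative.
RALLY = [
    ("Ellis",      "zero000", "zero100", 0),
    ("Barton",     "zero100", "zero023", 0),
    ("Pennington", "zero023", "one023",  0),
    ("Kennedy",    "one023",  "two023",  0),
    ("Davis",      "two023",  "two100",  2),
]

def runs_added(before, after, runs):
    """Expected runs added: runs scored plus the change in run expectancy."""
    return runs + RE[after] - RE[before]

credits = {b: runs_added(s1, s2, r) for b, s1, s2, r in RALLY}
for batter, value in credits.items():
    print(f"{batter:<11}{value:+.2f}")
print(f"{'Total':<11}{sum(credits.values()):+.2f}")
```

Run as written, this reproduces the figures above for Ellis, Barton, and Davis, and it charges Pennington and Kennedy with -.60 and -.85 expected runs, respectively, for the two outs they made.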

The methodology above (which mirrors that of the WX family of pitching statistics) was originally developed by Gary Skoog in The 1987 Bill James Abstract. His Marginal Runs Created statistic, a.k.a. “RC1,” measures productivity by summing up a batter’s expected runs added (or lost) over the course of a season. To do this, we take three values from each PA:

1) RE1: the expected runs of the base/out state just before the payoff pitch

2) RE2: the expected runs of the base/out state immediately following the PA

3) Play_Runs: The number of runs scored on the play. Note that this is not RBI; for example, a runner scoring on a double play will count here.

We then calculate the expected runs added (or lost) during the PA as Play_Runs + RE2 – RE1. There’s one important exception: If the subsequent base/out state is “bases empty, none out” and the current one is not, then we know the PA ended the inning. In that case, the formula is simply Play_Runs – RE1.
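As a rough sketch, the per-PA rule might look like the following. The function name `pa_runs_added` and the explicit `inning_ended` flag are assumptions of this example, not part of Skoog's formulation. The flag is passed in directly rather than inferred from the next state because a home run with nobody out and runners aboard also leaves the bases empty with none out, without ending the inning.

```python
def pa_runs_added(re1, re2, play_runs, inning_ended):
    """Expected runs added (or lost) in one plate appearance.

    re1          -- run expectancy of the state before the PA
    re2          -- run expectancy of the state after the PA
    play_runs    -- runs that scored on the play (not RBI)
    inning_ended -- True if the PA produced the third out (assumed input)
    """
    if inning_ended:
        # No state follows within the inning, so nothing is added back.
        return play_runs - re1
    return play_runs + re2 - re1

# Inning-ending strikeout with two out and runners on second and third (two023).
print(round(pa_runs_added(0.56, 0.0, 0, True), 2))    # -0.56

# Three-run homer with two out and runners on first and second (two120 -> two000).
print(round(pa_runs_added(0.46, 0.11, 3, False), 2))  # 2.65
```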

Finally, we sum up the results of the hitter’s plate appearances to get his total expected runs added (or lost) for the season. Here are the top ten in RC1 in 2009:

Batter	PA	RC1
Albert Pujols	700	72.25
Prince Fielder	719	65.79
Joe Mauer	606	56.00
Derrek Lee	615	52.03
Jason Bay	638	44.85
Hanley Ramirez	652	44.69
Kevin Youkilis	588	44.67
Joey Votto	544	44.60
Mark Teixeira	707	44.10
Ben Zobrist	599	42.10

Not much to say here: the list comprises 10 of the best hitters in baseball. Some fans might be surprised to see Votto and Zobrist, who lack the name recognition of the rest, but both were excellent in 2009.

What about the underachievers?

Batter	PA	RC1
Dioner Navarro	410	-39.07
Yuniesky Betancourt	508	-25.77
Cesar Izturis	412	-25.63
Willy Taveras	437	-25.35
Bill Hall	365	-24.02
Ivan Rodriguez	448	-23.48
Emilio Bonifacio	509	-23.46
Adrian Beltre	477	-22.60
Adam Everett	390	-22.60
Gerald Laird	477	-21.83

Dioner Navarro absolutely killed the Rays in 2009, nearly lapping the field here; posting a .583 OPS in 410 PA will do that. The rest weren’t quite that bad, but they fared very poorly in RC1 because they failed to produce in key situations.

RC1 is a counting stat, but what if we want to see who contributed most per plate appearance? One way is to convert RC1 to a rate stat by dividing by PA, but this will favor hitters who happened to come to the plate during high-leverage circumstances more often. A fairer approach is to take each player’s average expected runs added in each base/out state, then take the weighted mean of these averages, using the league-wide frequency of each state (see the table above) as the weights. This version, which I’ll call “Weighted Runs Created” or “WRC,” allows us to compare how many expected runs per PA players would have added if they were given equal opportunities. Here are the top and bottom 10 in WRC in 2009 (minimum 300 PA):

Batter	WRC
Joey Votto	0.127
Albert Pujols	0.111
Prince Fielder	0.094
Derrek Lee	0.094
Joe Mauer	0.092
Aramis Ramirez	0.088
Kevin Youkilis	0.085
Hanley Ramirez	0.082
Jim Thome	0.080
Pablo Sandoval	0.069

A big surprise here: Votto rockets to the top, leaving Pujols in the dust. Even the most ardent Reds fan wouldn’t argue that Votto is a better hitter than Pujols, but WRC tells us that Votto produced considerably more expected runs per plate appearance in 2009.

Batter	WRC
Dioner Navarro	-0.091
Bill Hall	-0.067
Cesar Izturis	-0.066
Willy Taveras	-0.065
Alex Cora	-0.063
Ronny Cedeno	-0.058
Adam Everett	-0.056
Yuniesky Betancourt	-0.055
Alex Gonzalez	-0.053
Jack Hannahan	-0.049

I said it once before, but it bears repeating: Dioner Navarro was absolutely terrible in 2009.
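For anyone who wants to reproduce WRC, the weighted mean could be sketched as follows. The function name `wrc` and the input layout are hypothetical, and one detail is an assumption of this sketch: states a hitter never faced are skipped and the remaining weights are renormalized. The sample plate appearances are made up for illustration, although the run values and frequencies come from the 2009 table.

```python
from collections import defaultdict

def wrc(pa_records, state_freq):
    """Frequency-weighted mean of a hitter's average runs added per state.

    pa_records -- iterable of (state, runs_added) pairs for one hitter
    state_freq -- league-wide frequency of each state (any consistent unit)
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for state, added in pa_records:
        sums[state] += added
        counts[state] += 1

    weighted, total_weight = 0.0, 0.0
    for state, n in counts.items():
        weight = state_freq[state]
        weighted += weight * (sums[state] / n)  # per-state average, weighted
        total_weight += weight
    # Assumption: renormalize over the states this hitter actually faced.
    return weighted / total_weight if total_weight else 0.0

# Made-up hitter: two PA leading off an inning, two PA in two023.
records = [("zero000", 0.36), ("zero000", -0.24),
           ("two023", 1.66), ("two023", -0.56)]
freq = {"zero000": 24.2, "two023": 1.0}

plain_rate = sum(r for _, r in records) / len(records)
print(round(plain_rate, 3), round(wrc(records, freq), 3))
```

In this toy case the plain per-PA rate is inflated by the hitter's unusually frequent trips to the plate with two runners in scoring position; weighting by league-wide frequencies pulls the figure back toward what he did in the far more common bases-empty state.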

So, what are RC1 and WRC good for? The vast majority of hitting metrics (TAv, VORP, OPS+, etc.) ignore context, and with good reason: A hitter is not responsible for the situations in which he comes to the plate, and a double is worth more than a single in the abstract. That said, accounting for context provides a different perspective and reflects the intuition (as illustrated in the example above) that some singles are worth more than some doubles. In addition, RC1 can identify unproductive players who are just good enough to hold down a lineup spot (H.A.C.K.I.N.G. M.A.S.S. participants, take note). Finally, I think WRC would be useful for comparing players from different eras. For example, to get an idea of what 2009 Joey Votto would have done in 1965, we can compute his WRC using 1965 base/out frequencies and run expectations. This adjusts for the fact that these frequencies and run expectations have changed over the years, providing a level playing field for Votto and Willie Mays.

As for drawbacks, I doubt that RC1 and WRC have much predictive value (though this hypothesis could be tested). They are derived from a hitter’s performance in each base/out state, and there’s scant evidence that state-based (e.g., “clutch”) performance is a repeatable skill. With apologies to Reds fans, don’t expect Votto to repeat his 2009 league-leading WRC performance. Thus, while RC1 and WRC may be of limited value for looking forward, they provide a useful and interesting lens for looking back.

Dan Malkiel is an author of Baseball Prospectus.

16 comments have been left for this article.

dalbano

I think RC1 and WRC are more a function of overall team offensive performance combined with productive hitters in the middle of a solid lineup.

Maybe this says more about where a particular player SHOULD be hitting in a particular lineup, vs what a particular player will accomplish on his own in the future. For example, Beltre should not be a 4 or 5 hitter for a major league club.

Mar 14, 2010 11:34 AM