"Support-Neutral" Statistics -- A Method of Evaluating the True Quality of a Pitcher's Start

by Michael Wolverton


This article originally appeared in By The Numbers, SABR's Statistical Analysis subcommittee's newsletter, Volume 5, Number 4, Dec. 1993. The only change made to the original article is the regeneration of Figure 1 with new parks and values (the original numbers were lost), and the corresponding modification of the text describing that figure. One significant change has been made to the method for calculating the SN stats since this article was published; that change is documented in the summary.

Motivation

In recent years, we've seen the development and growing use of two measurements designed to evaluate starting pitchers on a game-by-game basis: Quality Starts and Game Score. Both measures are attempting in some way to look at the quality of each outing the starter has, rather than looking at the average or cumulative performance over the course of a year like ERA does. But both measures have their weaknesses as total measures of pitching performance.

The arguments against Quality Starts are well known. Detractors point out that the worst qualifying outing -- 6 innings and 3 earned runs -- is not "quality" at all. A related objection is that Quality Starts makes no attempt to quantify the degree of quality a start has -- 6 innings, 3 runs is the same as 8 innings, 2 runs, which is the same as a 9-inning shutout.

Partly in answer to these objections, Bill James developed the Game Score, which combines a starter's box score numbers (IP, H, ER, R, BB, K) using weights, where the weights are assigned such that the league average score is around 50, the best imaginable score is around 100, and the worst imaginable score (by someone outside the state of Colorado) is around 0. Game Score is acknowledged as an interesting measure of "game domination" by a starter, but it has weaknesses as a total measure of starter quality (i.e., his contribution to team victories): it's too dependent on strikeouts, possibly too dependent on hits and walks (after all, the number of runs given up is really the only thing that matters), and it isn't park-adjusted.

Despite the weaknesses of these two measures, looking at a pitcher's starts game-by-game is still a good idea. Looking at each start's contribution to winning, rather than cumulative run-prevention over the course of a year (ERA or Pitching Runs), can help us answer questions like: Given equal ERAs, do some pitchers pitch in a way that will tend to win more games than other pitchers? In particular, is it better for a starter to be flaky -- either very good or very bad on a given day -- or consistently average? Does the park have a smaller influence on the value of the start when the start is very good or very bad?

So here's what we'd like out of a stat measuring the quality of a start:

I've developed a couple of measurements that meet these five requirements. (Actually, the ideal stat would also be very easy to compute, but hey, 5 out of 6 isn't bad, right?). Support-Neutral Wins and Support-Neutral Losses (SNW and SNL) measure the expected number of wins and losses a pitcher would have with his outings, if he got average support from his offense and his bullpen. Support-Neutral Value Added (SNVA) measures the total number of games that an average team would win given the pitcher's starts, over the number of games they'd win with a league average starter. All of these stats are computed using only the number of innings pitched, number of runs given up, and the park the game was pitched in. SNVA may be a slightly more accurate measure of a starter's actual value compared to league average, but the SNW/L record has the advantage of being flexible and more understandable. Both of them, in my opinion, constitute an improvement over Thorn and Palmer's Pitching Runs as a total measure of starter worth.

 

Support-Neutral Wins and Losses

Support-Neutral Wins is calculated by determining the probability that a pitcher would get the win for each start he has, and then summing up the individual probabilities over all of his starts. The sum gives you the number of wins a pitcher could expect to get for an average team, given his performances. A "performance" here consists only of the number of innings pitched, the number of runs (not earned runs) given up, the park in which the game was played, and whether the pitcher was at home or on the road -- SNW assumes that these are the only things which influence whether the pitcher wins or loses.

The rest of this section describes the formulas that are used to calculate SNW; readers who aren't interested in the specific methods of calculation are welcome to skim or skip to the next section.

To calculate the probability that a pitcher wins the game, we just need to look at the definition of a win: A starting pitcher wins the game if his team has the lead when he's taken out of the game, and they never relinquish that lead. So, for a given outing by the starter, the probability that he gets the win is just the probability that his team will take the lead (score more runs than the starter gives up) by the time he's removed times the probability that they'll hold that lead until the game is over.

To put this into a formula, we just need to determine and add up the probabilities of all the different ways his team can take and hold a lead:

where

SNW(i, r) is the probability a starter who goes i innings and gives up r runs will get the win, given an average team playing behind him.

PScore(i, r) is the probability that an average team will score r runs in i innings.

PHold(k, i) is the probability that an average team will hold a k-run lead (without ever relinquishing it) for the i remaining innings until the end of the game.
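Written out from these definitions, one formula consistent with the description is sketched below. It is a reconstruction under stated assumptions, not necessarily the exact expression used in the original article: the starter is at home, pitches i full innings of a 9-inning game, and the index j (a name introduced here for illustration) is the number of runs his team has scored when he leaves.

```latex
\mathrm{SNW}(i, r) \;=\; \sum_{j = r + 1}^{\infty} \mathrm{PScore}(i, j)\,\cdot\,\mathrm{PHold}(j - r,\; 9 - i)
```

Each term is the chance that the team has scored exactly j runs (more than the r the starter allowed) times the chance that it then holds the resulting (j - r)-run lead over the 9 - i innings that remain.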

The above formula is actually a simplification of the formula I use in my software to calculate SNW (I'll refer to the formula in my software as the "real" SNW formula). In order to make it easier to explain, I made a few assumptions to get the formula above. First, that formula assumes that the starter comes out of the game after pitching a full inning (i.e., he pitches no extra thirds of an inning). The formula is complicated somewhat when thirds of an inning are taken into account, but the same general idea applies: his team must be leading when he comes out, and his team must hold the lead for the extra thirds in the inning he leaves, plus all the rest of the remaining innings. The real SNW formula does take thirds of an inning into account.

Second, the above formula doesn't explicitly take the park into account. To take park effects into account, we need to make SNW, PScore, and PHold be functions of the park in which the game is played. A hitter's park should inflate the probabilities that an average team will score a high number of runs, and a pitcher's park should do the opposite. The real SNW formula does take park into account. I talk a little more about my handling of park effects in the Appendix.

Third, the above formula doesn't take into account whether the starter is pitching at home or on the road. Maybe contrary to intuition, this does make a difference. Consider a starter who leaves after pitching the 7th inning: if he's at home, he's pitched the top of the 7th, so he gets credit for the runs his team scored in the first 6 innings, plus the runs they score in the bottom of the 7th; if he's on the road, however, he pitched the bottom of the 7th, so he gets credit for the runs his team scored in the first 7 innings, plus the runs they score in the top of the 8th. So, all other things being equal, it's easier for pitchers to get wins (and harder for them to get losses) when they pitch on the road. The formula above is for a pitcher pitching at home, and the road formula is slightly different. The real SNW formula does take home/road status into account.

Finally, the above formula doesn't quite reflect the full definition of a pitcher's win -- a starter can't get the win unless he goes 5 innings or more. Presumably, this extra condition was put into the win rule to reduce the number of undeserving starters getting lucky wins. But when you're assigning fractions of a win, rather than 1 win or 0 wins, there's no possibility of getting lucky. So, the real SNW/L formula does not take the five-inning condition into account, although, for the purposes of comparison, I do calculate an expected win (E(W)) number which is equal to 0 if the pitcher goes fewer than 5 innings and equal to SNW otherwise.

Let's finish off the formula above. PScore is easy to find recursively, provided you know an average team's single-inning scoring distribution, PInningScore:

where

PInningScore(r) is the probability that an average team will score r runs in an inning.
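A minimal sketch of that recursion: condition on the k runs scored in the last inning, so PScore(i, r) is the sum over k of PInningScore(k) times PScore(i-1, r-k), with PScore(0, 0) = 1. The distribution P_INNING below is made up for illustration and is not the article's linescore data; the names p_score and P_INNING are likewise assumptions.

```python
from functools import lru_cache

# Hypothetical single-inning scoring distribution: P_INNING[r] is the chance
# an average team scores r runs in an inning. Illustrative numbers only,
# truncated at 5 runs; NOT the distribution used in the article.
P_INNING = (0.73, 0.15, 0.07, 0.03, 0.015, 0.005)


@lru_cache(maxsize=None)
def p_score(innings: int, runs: int) -> float:
    """Probability that an average team scores exactly `runs` in `innings`."""
    if runs < 0:
        return 0.0
    if innings == 0:
        # With no innings played, the only possible total is 0 runs.
        return 1.0 if runs == 0 else 0.0
    # Condition on the number of runs k scored in the last inning.
    max_k = min(runs, len(P_INNING) - 1)
    return sum(P_INNING[k] * p_score(innings - 1, runs - k)
               for k in range(max_k + 1))


if __name__ == "__main__":
    for r in range(6):
        print(f"P(exactly {r} runs in 7 innings) = {p_score(7, r):.3f}")
```

Memoizing with lru_cache keeps the recursion cheap, since the same (innings, runs) pairs are requested many times.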

PHold is a little more complicated, since you have to see to it that the pitcher's team never relinquishes the lead. Still, it's not too hard to reduce it to the following (below, "tr" stands for the number of runs the pitcher's team scores in an inning, and "or" stands for the number the opposing team scores in an inning):
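One way such a recursion can be sketched in code is shown here. It is an illustration under assumptions, not the article's exact formula: the starter is at home, so in each remaining inning the opponents bat first and the lead must still be positive after their half; the scoring distribution P_INNING is invented; and the names p_hold, snw, and MAX_RUNS are hypothetical.

```python
from functools import lru_cache

# Hypothetical single-inning scoring distribution (illustrative numbers only).
P_INNING = (0.73, 0.15, 0.07, 0.03, 0.015, 0.005)
MAX_RUNS = len(P_INNING) - 1  # most runs this truncated model allows per inning


@lru_cache(maxsize=None)
def p_score(innings: int, runs: int) -> float:
    """Probability that an average team scores exactly `runs` in `innings`."""
    if runs < 0:
        return 0.0
    if innings == 0:
        return 1.0 if runs == 0 else 0.0
    return sum(P_INNING[k] * p_score(innings - 1, runs - k)
               for k in range(min(runs, MAX_RUNS) + 1))


@lru_cache(maxsize=None)
def p_hold(lead: int, innings_left: int) -> float:
    """Probability of holding `lead` for `innings_left` innings without
    ever letting the opponents tie or go ahead (assumed: home team)."""
    if lead <= 0:
        return 0.0
    if innings_left == 0 or lead > MAX_RUNS * innings_left:
        return 1.0  # game over, or the lead is out of reach in this model
    total = 0.0
    # "opp" plays the role of "or" in the text, "team" the role of "tr".
    for opp in range(min(lead - 1, MAX_RUNS) + 1):  # opponents stay behind
        for team in range(MAX_RUNS + 1):
            total += (P_INNING[opp] * P_INNING[team]
                      * p_hold(lead - opp + team, innings_left - 1))
    return total


def snw(innings: int, runs_allowed: int) -> float:
    """Win probability for a home starter who pitches `innings` full innings
    and allows `runs_allowed`, with an average team behind him."""
    return sum(p_score(innings, j) * p_hold(j - runs_allowed, 9 - innings)
               for j in range(runs_allowed + 1, MAX_RUNS * innings + 1))


if __name__ == "__main__":
    print(f"7 IP, 2 R: SNW = {snw(7, 2):.3f}")
```

Thirds of an inning, park effects, and the road case are all left out of this sketch, as they are in the simplified formula described in the text.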

The only remaining unknown is the single-inning scoring distribution, PInningScore. But that's readily available from linescores of past games. The scoring distribution (separate distributions for each league) I'm using right now was taken from a few weeks of linescores in USA TODAY from late-April and early-May of 1992. I'll probably be able to get a more accurate distribution someday, but I'm sure that this one is close enough.

The SNL value for a single start is calculated analogously to SNW.

 

Support-Neutral Value Added

SNW and SNL give us a nice way of getting a "fair" W/L record for a starter, which can then be used to compare to his actual W/L record, or a replacement-level winning percentage, etc. (see the Results section). But these numbers calculate how likely it is that the pitcher will win or lose the game -- i.e., get the "W" or "L" next to his name in the box score. A related but slightly different notion is the likelihood that the team will win when a pitcher takes the mound. In measuring the starter's contribution to team victories, we'd like to evaluate how much the outing by the starter changes the team's chance of winning from what it was at the beginning of the game (which I'll assume to be 50%). This is what SNVA is designed to measure.

Not surprisingly, the formula for SNVA looks pretty similar to the formula for SNW:

where

SNVA(i, r) is the difference between an average team's chance of winning after the starter has left after pitching i innings and giving up r runs, and their chance of winning at the beginning of the game (50%).

PScore(i, r) was defined above

PATWin(r, i) is the chance that an average team will eventually win the game given that there are i innings left and the difference between their score and their opponents' score is r.

Also not surprisingly, PATWin looks a lot like PHold:
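A hedged sketch of one such recursion, together with the SNVA sum it feeds: SNVA(i, r) is taken here as the sum over j of PScore(i, j) times PATWin(j - r, 9 - i), minus 0.5. Several things are assumptions made for illustration: the invented distribution P_INNING, the names p_at_win and snva, whole innings only with no home/road distinction, and a 50% win chance for a game still tied after nine innings.

```python
from functools import lru_cache

# Hypothetical single-inning scoring distribution (illustrative numbers only).
P_INNING = (0.73, 0.15, 0.07, 0.03, 0.015, 0.005)
MAX_RUNS = len(P_INNING) - 1


@lru_cache(maxsize=None)
def p_score(innings: int, runs: int) -> float:
    """Probability that an average team scores exactly `runs` in `innings`."""
    if runs < 0:
        return 0.0
    if innings == 0:
        return 1.0 if runs == 0 else 0.0
    return sum(P_INNING[k] * p_score(innings - 1, runs - k)
               for k in range(min(runs, MAX_RUNS) + 1))


@lru_cache(maxsize=None)
def p_at_win(diff: int, innings_left: int) -> float:
    """Chance an average team eventually wins when it leads by `diff`
    (negative if trailing) with `innings_left` innings to play."""
    if diff > MAX_RUNS * innings_left:
        return 1.0  # lead cannot be erased in this truncated model
    if diff < -MAX_RUNS * innings_left:
        return 0.0  # deficit cannot be made up
    if innings_left == 0:
        return 0.5  # tied after nine: assumed coin flip in extra innings
    # Unlike PHold, the lead may change hands; only the final result matters.
    return sum(P_INNING[team] * P_INNING[opp]
               * p_at_win(diff + team - opp, innings_left - 1)
               for team in range(MAX_RUNS + 1)
               for opp in range(MAX_RUNS + 1))


def snva(innings: int, runs_allowed: int) -> float:
    """Change in an average team's win chance, relative to 50%, after a
    start of `innings` full innings and `runs_allowed` runs."""
    win_chance = sum(p_score(innings, j) * p_at_win(j - runs_allowed, 9 - innings)
                     for j in range(MAX_RUNS * innings + 1))
    return win_chance - 0.5


if __name__ == "__main__":
    print(f"7 IP, 2 R: SNVA = {snva(7, 2):+.3f}")
```

Because the win chance lies between 0 and 1, a single start in this sketch can be worth no more than +0.5 and no less than -0.5 games.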

What SNVA gives us (when summed over all a pitcher's starts) is the number of games in the standings he's worth to his team above the average starter. Of course, this is exactly the same unit (games above the average player) that all of Total Baseball's [1] measurements are in. So it'll be interesting to compare SNVA to Thorn and Palmer's Adjusted Pitching Runs (see footnote 1) to see how well they correlate and also where the differences lie.

 

Results

Best, worst, luckiest, and unluckiest starters of 1992

That's enough of the gory details of the calculation of the stats. Let's look at the fun stuff -- what the stats tell us about real pitchers. I tracked all starting pitchers in the majors over the 1992 season, and Tables 1 and 2 show the top pitchers in both leagues for 1992. Each table shows the pitcher's Support-Neutral Wins (SNW), Losses (SNL), and Winning Percentage (SNPct), followed by his actual win-loss record (W, L), his runs allowed per 9 innings (RA), his Adjusted Pitching Runs (APR), and his Support-Neutral Value Added (SNVA). Interestingly, Greg Maddux, with the fabulous year he had pitching in Wrigley, was the only pitcher in either league who came close to "deserving" to win 20 games.
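The SNPct column can be reproduced from SNW and SNL. The sketch below assumes the usual definition of winning percentage, wins divided by decisions, which matches the published figures; the function name sn_pct is hypothetical.

```python
def sn_pct(snw: float, snl: float) -> float:
    """Support-neutral winning percentage: SNW / (SNW + SNL).

    Assumes the standard wins-over-decisions definition.
    """
    decisions = snw + snl
    if decisions == 0:
        raise ValueError("pitcher has no support-neutral decisions")
    return snw / decisions


if __name__ == "__main__":
    # Mussina's 1992 line from Table 1: SNW 17.2, SNL 7.8
    print(f"{sn_pct(17.2, 7.8):.3f}")
```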

        Pitcher     Team   SNW   SNL  SNPct   W   L     RA    APR   SNVA
        Mussina     BAL   17.2   7.8   .688  18   5   2.61   47.0   4.60
        Clemens     BOS   17.5   8.5   .674  18  11   2.92   43.8   4.39
        Appier      KCR   15.2   6.6   .698  15   8   2.55   42.6   4.08
        Guzman,Ju   TOR   13.4   6.4   .679  16   5   2.79   32.3   3.34
        Nagy        CLE   16.3   9.9   .623  17  10   3.25   33.3   3.11
        Eldred      MIL    8.2   2.4   .776  11   2   1.88   28.1   2.81
        McDowell    CHI   16.3  10.7   .602  20  10   3.28   30.5   2.53
        Smiley      MIN   16.0  10.5   .603  16   9   3.47   28.3   2.75
        Navarro     MIL   15.8  10.8   .595  17  11   3.59   22.8   2.45
        Abbott,J    CAL   13.6   8.6   .612   7  15   3.11   27.7   2.36
        Viola       BOS   15.8  11.2   .586  13  12   3.74   21.4   2.35
        Fleming     SEA   15.1  10.7   .586  17  10   3.73   19.7   2.10
        Perez,M     NYY   14.9  10.5   .586  13  16   3.42   26.3   1.90
        Wegman      MIL   15.6  11.5   .576  13  14   3.58   24.4   2.06
        Erickson    MIN   13.3   9.8   .574  13  12   3.65   20.8   1.75
        Bosio       MIL   14.3  11.1   .563  16   6   3.89   13.6   1.52
        Key         TOR   13.3  10.4   .561  13  13   3.66   18.0   1.42
        Brown,K     TEX   15.5  12.9   .545  21  11   3.96   14.5   1.18
        Welch       OAK    8.0   5.8   .580  11   7   3.42    9.7   0.91
        Rasmussen   KCR    3.0   0.8   .785   4   1   1.67   11.4   1.09

Table 1: Top 20 AL Starters in 1992, ranked by SNW-SNL

 

        Pitcher     Team   SNW   SNL  SNPct   W   L     RA    APR   SNVA
        Maddux,G    CHI   19.5   7.4   .724  20  11   2.28   53.9   5.75
        Tewksbury   STL   16.1   7.3   .687  15   5   2.45   38.5   4.12
        Schilling   PHI   13.9   6.8   .670  12   9   2.59   31.1   3.37
        Morgan      CHI   16.3   9.5   .632  16   8   3.00   30.4   3.22
        Rijo        CIN   13.9   8.1   .632  15  10   2.86   28.5   2.57
        Smoltz      ATL   16.6  11.0   .601  15  12   3.28   25.1   2.67
        Glavine     ATL   15.1   9.8   .608  20   8   3.24   23.9   2.71
        Martinez,D  MON   14.5   9.1   .613  16  11   2.98   24.0   2.50
        Swindell    CIN   13.8   8.5   .619  12   7   3.05   24.5   2.56
        Swift       SFG   10.4   5.1   .670   9   3   2.36   23.6   2.51
        Drabek      PIT   15.9  10.8   .595  15  11   2.95   26.7   2.32
        Fernandez,S NYM   13.6   8.8   .608  14  11   2.81   24.7   2.25
        Hill        MON   14.0  10.0   .583  16   9   3.14   19.3   1.93
        Leibrandt   ATL   13.5   9.5   .586  15   7   3.68   11.8   2.02
        Smith,P     ATL    5.6   2.1   .724   7   0   2.22   16.2   1.69
        Wakefield   PIT    6.4   3.3   .656   8   1   2.54   13.8   1.42
        Rivera      PHI    6.5   3.7   .639   7   3   2.95   10.8   1.34
        Benes       SDP   14.0  11.3   .553  13  14   3.50   10.7   1.22
        Portugal    HOU    6.6   4.0   .621   5   3   2.69   12.5   1.18
        Avery       ATL   14.0  11.5   .549  11  11   3.66   14.8   1.15

Table 2: Top 20 NL Starters in 1992, ranked by SNW-SNL

 

On the flip side, Tables 3 and 4 show the worst 10 starting pitchers in 1992 for each league (see footnote 2). Not surprisingly, many of these guys showed up in different uniforms in 1993, several on expansion teams.

 

        Pitcher     Team   SNW   SNL  SNPct   W   L     RA    APR   SNVA
        Armstrong   CLE    5.2  11.5   .313   3  15   6.37  -28.4  -3.08
        Milacki     BAL    4.5   9.5   .320   6   8   6.18  -21.8  -2.32
        Terrell     DET    2.9   7.5   .280   3   6   6.98  -22.9  -2.26
        Slusarski   OAK    2.5   6.9   .265   5   5   6.25  -18.7  -2.05
        Sanderson   NYY    9.9  14.0   .414  12  11   5.40  -22.4  -2.02
        Aldred      DET    2.4   6.5   .273   2   7   7.63  -21.7  -1.89
        McCaskill   CHI   10.2  14.2   .417  12  13   5.00  -16.1  -1.92
        Wells       TOR    3.4   6.9   .332   6   7   7.70  -27.7  -1.81
        Stieb       TOR    3.4   6.7   .337   3   6   5.92  -13.3  -1.50
        Otto        CLE    3.9   7.2   .354   5   9   6.75  -19.8  -1.57

Table 3: Bottom 10 AL Starters in 1992, ranked by SNW-SNL

 

        Pitcher     Team   SNW   SNL  SNPct   W   L     RA    APR   SNVA
        Bowen       HOU    0.6   6.1   .094   0   7  12.22  -31.3  -2.61
        Wilson,T    SFG    7.0  11.3   .384   8  14   4.79  -18.5  -2.03
        Abbott,K    PHI    4.5   8.3   .352   1  14   4.92  -11.4  -1.84
        Martinez,R  LAD    7.3  11.1   .397   8  11   4.90  -19.1  -1.84
        Henry,B     HOU    8.3  11.7   .414   6   9   4.40  -12.4  -1.57
        Young,A     NYM    2.8   6.2   .313   1   7   5.79  -16.8  -1.63
        Black       SFG    8.6  11.9   .420  10  12   4.47  -14.7  -1.54
        Hershiser   LAD   10.2  13.3   .434  10  15   4.31  -12.5  -1.60
        Hammond     CIN    7.0  10.0   .409   7  10   4.61   -7.4  -1.36
        Blair       HOU    1.4   4.5   .241   1   5   7.51  -16.8  -1.52

Table 4: Bottom 10 NL Starters in 1992, ranked by SNW-SNL

 

This method also allows you to evaluate the level of luck a pitcher experienced in his W/L record -- i.e., it allows you to look at how much a pitcher's actual W/L record differs from his expected W/L record given the way he pitched. Tables 5 through 8 show the luckiest and unluckiest starters in each league in 1992. No one should be surprised that Jack Morris, who compiled a 21-6 record despite a 4+ ERA, was far and away the luckiest starter in either league last year. SNW/L evaluation shows that you'd expect his 1992 performance to produce a 13-13 mark if he had gotten average support. Equally unsurprising is the result that Jim Abbott was the unluckiest pitcher in either league. The Angels gave him enough support only for a miserable 7-15 record, while his pitching actually merited something closer to 13-9.
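The luck figure in the Diff column of Tables 5 through 8 is extra wins plus avoided losses, W-E(W) + E(L)-L. A small sketch of that arithmetic follows; the function name luck is hypothetical, and the inputs are Jim Abbott's published 1992 line.

```python
def luck(wins: int, losses: int, exp_wins: float, exp_losses: float) -> float:
    """Luck in a W/L record: (W - E(W)) + (E(L) - L).

    Positive values mean the actual record was better than the expected one.
    """
    return (wins - exp_wins) + (exp_losses - losses)


if __name__ == "__main__":
    # Jim Abbott, 1992: actual 7-15, expected 13.3-8.6
    print(f"{luck(7, 15, 13.3, 8.6):+.1f}")
```

Because the published E(W) and E(L) are rounded to one decimal, recomputing Diff from the table can differ from the printed value by a tenth for some pitchers.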

        Pitcher     Team  E(W)  E(L)   W   L   Diff
        Morris      TOR   13.3  13.1  21   6   14.7
        Brown,K     TEX   15.5  12.9  21  11    7.4
        Moore       OAK   12.1  14.3  17  12    7.3
        Bosio       MIL   14.2  11.1  16   6    6.9
        Hibbard     CHI    8.1  11.3  10   7    6.2
        Darling     OAK   11.5  12.5  15  10    6.0
        Sanderson   NYY    9.7  14.0  12  11    5.3
        Wickman     NYY    2.7   2.9   6   1    5.2
        Slusarski   OAK    2.3   6.9   5   5    4.6
        McDowell    CHI   16.2  10.7  20  10    4.5

Table 5: Luckiest 10 AL Starters in 1992, ranked by W-E(W) + E(L)-L

        Pitcher     Team  E(W)  E(L)   W   L   Diff
        Abbott,J    CAL   13.3   8.6   7  15  -12.7
        Perez,M     NYY   14.6  10.5  13  16   -7.0
        Hanson      SEA    9.6  12.8   7  17   -6.8
        Armstrong   CLE    5.2  11.5   3  15   -5.8
        Wegman      MIL   15.6  11.5  13  14   -5.1
        Valera      CAL   10.4   9.6   7  11   -4.8
        Kamieniecki NYY    8.8  12.0   6  14   -4.7
        Ryan        TEX    9.3   8.7   5   9   -4.6
        Chiamparino TEX    1.5   1.3   0   4   -4.2
        Reed        KCR    5.1   6.0   2   7   -4.1

Table 6: Unluckiest 10 AL Starters in 1992, ranked by W-E(W) + E(L)-L

        Pitcher     Team  E(W)  E(L)   W   L   Diff
        Burkett     SFG    9.6  13.0  13   9    7.5
        Glavine     ATL   15.0   9.8  20   8    6.8
        Seminara    SDP    5.1   6.3   9   4    6.2
        Lefferts    SDP    8.4  10.4  13   9    6.0
        Tomlin      PIT   11.5  11.3  14   9    4.8
        Hurst,B     SDP   12.4  12.1  14   9    4.7
        Cone        NYM   11.3   9.9  13   7    4.6
        Leibrandt   ATL   13.2   9.5  15   7    4.3
        Osborne     STL    9.2  11.4  10   8    4.2
        Wakefield   PIT    6.3   3.3   8   1    4.0

Table 7: Luckiest 10 NL Starters in 1992, ranked by W-E(W) + E(L)-L

        Pitcher     Team  E(W)  E(L)   W   L   Diff
        Abbott,K    PHI    4.5   8.3   1  14   -9.2
        Candiotti   LAD   11.8  10.5  10  15   -6.3
        Gross,Ke    LAD   10.9  10.9   8  13   -5.0
        Clark,M     STL    5.4   8.0   3  10   -4.4
        Schilling   PHI   13.9   6.8  12   9   -4.0
        Benes       SDP   13.8  11.3  13  14   -3.5
        Carter      SFG    1.5   2.4   1   5   -3.1
        Boskie      CHI    3.2   7.1   3  10   -3.1
        Maddux,G    CHI   19.5   7.4  20  11   -3.1
        Whitehurst  NYM    2.3   3.3   1   5   -3.0

Table 8: Unluckiest 10 NL Starters in 1992, ranked by W-E(W) + E(L)-L

League total numbers

In theory, the support-neutral record of the entire league should come close to the actual win-loss record of the league, and in fact, in 1992, SNW/L did appear to predict league W/L pretty well. Table 9 shows both the expected and actual W/L totals for each league in 1992. The National League's record corresponded very well to the record expected by the model, with no-decisions being underpredicted only slightly by SNW/L. The American League is predicted a little less successfully -- there were nearly 30 more wins in the league than expected, and nearly 10 fewer losses than expected. I believe that part of the discrepancy between expected record and actual record can be explained by the fact that relief pitchers prevented runs better than starters in 1992. Since starters are competing for the (actual) decision primarily with the other starter, it makes sense that starters would get a few more (actual) wins than predicted by a model which has them competing with league average pitching for the decision.

 

        League    E(W)   E(L)  E(Pct.)    W    L  Pct.
        NL       660.9  690.3     .489  655  678  .491
        AL       776.1  846.7     .478  805  837  .490

Table 9: Expected and Actual records of all starters in the leagues

 

 

Value of "flaky" and "steady" pitchers

Do the Support-Neutral stats tell us anything that Thorn and Palmer's Adjusted Pitching Runs weren't already telling us? Since both APR and SNVA are trying to measure exactly the same thing (albeit by different methods), we'd expect there to be a pretty strong correlation between them. There is. For most pitchers, SNVA (whose unit is "games above average") is approximately equal to one-tenth of APR (whose unit is "runs above average"). This is what you'd expect given the well-known result that each 10 runs prevented (or gained) leads on average to about 1 extra win in the standings (see, e.g., [2]). However, there are plenty of cases where APR and SNVA give significantly different evaluations. Look at the 1992 records of Charlie Leibrandt and Melido Perez:

                      APR    SNVA
        Leibrandt    11.8    2.02
        Perez,M      26.3    1.90

APR evaluates Perez as being 14.5 runs -- about one-and-a-half games -- better than Leibrandt. However, SNVA shows that, when the pitchers' performance is evaluated game-by-game, Leibrandt was actually a little better than Perez.
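The comparison can be put on a common scale by converting APR to games with the 10-runs-per-win rule of thumb cited above. This is a rough sketch: the constant RUNS_PER_WIN and the helper name apr_in_games are introduced here for illustration, and the inputs are the two published lines.

```python
RUNS_PER_WIN = 10.0  # rule of thumb: about 10 runs per win in the standings

# (APR, SNVA) for 1992, taken from the table above
pitchers = {
    "Leibrandt": (11.8, 2.02),
    "Perez,M": (26.3, 1.90),
}


def apr_in_games(apr: float) -> float:
    """Convert runs above average into approximate games above average."""
    return apr / RUNS_PER_WIN


# Positive gap: SNVA rates the pitcher higher than APR does.
gaps = {name: snva - apr_in_games(apr) for name, (apr, snva) in pitchers.items()}

if __name__ == "__main__":
    for name, gap in gaps.items():
        print(f"{name:<10} SNVA - APR/10 = {gap:+.2f} games")
```

On this scale APR undervalues Leibrandt by a little over eight-tenths of a game relative to SNVA and overvalues Perez by about seven-tenths.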

The key to this discrepancy between the two measurements is found in the amount of consistency the two pitchers exhibited in their starts. Perez was a model of consistency last year; he rarely got bombed, but he also was rarely dominating. Leibrandt, on the other hand, was one of the least consistent pitchers in the majors. And that is the most surprising result I've seen so far from these SN stats: run-prevention stats such as ERA and APR tend to undervalue flaky pitchers, and overvalue consistent ones, at least when you consider them pitching for an average team. Tables 10 through 13 show the "flakiest" (most inconsistent) and "steadiest" (most consistent) pitchers in the leagues last year, as evaluated by the variance of the SNVA of their individual starts. You can see from those tables that APR pretty consistently underestimates a pitcher's value when the pitcher is flaky, and pretty consistently overestimates his value when he's steady. 9 of the 10 flakiest pitchers in both the NL and AL were underestimated by APR, and 8 of the 10 steadiest in the NL and 10 of the 10 steadiest in the AL were overestimated by APR. And the pitchers for whom there were really large discrepancies between APR and SNVA -- Leibrandt, Kyle Abbott, Gooden, Hammond, Sutcliffe, Perez, Kamieniecki, McDowell -- all showed up near the top of the predicted list.
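Flakiness as used here is the variance of a pitcher's per-start SNVA values. A minimal sketch follows, assuming population variance (whether population or sample variance was used is not stated) and using invented per-start numbers, since the individual starts are not listed.

```python
from statistics import pvariance


def flakiness(start_values: list[float]) -> float:
    """Variance of per-start SNVA values (population variance assumed)."""
    return pvariance(start_values)


# Invented per-start SNVA values for two hypothetical pitchers whose
# season totals are identical (+0.25 games each).
steady = [0.05, 0.10, 0.00, 0.08, 0.02]
flaky = [0.45, -0.40, 0.40, -0.35, 0.15]

if __name__ == "__main__":
    print(f"steady: total {sum(steady):+.2f}, variance {flakiness(steady):.3f}")
    print(f"flaky:  total {sum(flaky):+.2f}, variance {flakiness(flaky):.3f}")
```

The two made-up pitchers have the same total value but very different variances, which is the distinction the flakiest and steadiest tables are drawing.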

The reason for this undervaluing is that APR counts all runs as equal, while in fact all runs do not contribute an equal amount toward winning/losing a game. In particular, Bill James did a study that showed that runs scored by a team after they've already scored 5 in a game do not contribute the same amount toward the probability of winning as those first 5 runs did [3]. So, pitchers who give up more than 5 runs in a couple of games will be undervalued by ERA and APR, because those really crummy outings probably weren't quite as crummy as ERA and APR would have you believe.

        Pitcher     Team    APR   SNVA  SNVA Var
        Smith,Z     PIT     3.7   0.70     0.088
        Smoltz      ATL    25.1   2.67     0.083
        Saberhagen  NYM     4.5   0.73     0.082
        Leibrandt   ATL    11.8   2.02     0.082
        Osborne     STL   -12.6  -0.97     0.079
        Glavine     ATL    23.9   2.71     0.076
        Hurst,B     SDP    -1.5   0.12     0.075
        Cone        NYM     8.5   0.67     0.074
        Belcher     CIN     1.9   0.53     0.074
        Benes       SDP    10.7   1.22     0.068

Table 10: Flakiest 10 NL Starters in 1992, ranked by variance of SNVA (15 starts minimum)

        Pitcher     Team    APR   SNVA  SNVA Var
        Abbott,K    PHI   -11.4  -1.84     0.022
        Rijo        CIN    28.5   2.57     0.032
        Browning    CIN    -8.7  -1.17     0.035
        Gooden      NYM    -6.1  -1.33     0.036
        Hammond     CIN    -7.4  -1.36     0.041
        Tewksbury   STL    38.5   4.12     0.042
        Maddux,G    CHI    53.9   5.75     0.042
        Fernandez,S NYM    24.7   2.25     0.043
        Boskie      CHI   -12.5  -1.47     0.044
        Gardner     MON    -9.5  -1.20     0.044

Table 11: Steadiest 10 NL Starters in 1992, ranked by variance of SNVA (15 starts minimum)

        Pitcher     Team    APR   SNVA  SNVA Var
        Sutcliffe   BAL    -8.3  -0.33     0.089
        Smiley      MIN    28.3   2.75     0.078
        Krueger     MIN    -0.1   0.14     0.078
        Johnson,R   SEA     1.7   0.24     0.077
        Gubicza     KCR     7.3   0.78     0.075
        Langston    CAL     5.6   0.77     0.073
        Fleming     SEA    19.7   2.10     0.073
        Viola       BOS    21.4   2.35     0.073
        Rhodes      BAL     6.7   0.77     0.071
        Darling     OAK    -4.8  -0.31     0.070

Table 12: Flakiest 10 AL Starters in 1992, ranked by variance of SNVA (15 starts minimum)

        Pitcher     Team    APR   SNVA  SNVA Var
        Armstrong   CLE   -28.4  -3.08     0.034
        Darwin      BOS     8.2   0.45     0.036
        Milacki     BAL   -21.8  -2.32     0.037
        Kamieniecki NYY    -8.9  -1.57     0.038
        Perez,M     NYY    26.3   1.90     0.039
        Reed        KCR    -0.3  -0.20     0.040
        Appier      KCR    42.6   4.08     0.040
        Cook        CLE    -3.6  -0.56     0.042
        Hibbard     CHI   -10.7  -1.28     0.045
        McDowell    CHI    30.5   2.53     0.045

Table 13: Steadiest 10 AL Starters in 1992, ranked by variance of SNVA (15 starts minimum)

As an example of this, consider a David Wells outing from 1992: he gave up 13 runs in 4+ innings. APR just subtracts his 13 runs from the number of runs a league average pitcher would have given up in those same 4 innings (about 2), and concludes that Wells was worth about -11 runs, or -1.1 games, in that start. Did Wells really cost the Blue Jays more than a game in the standings with that awful start? Of course not. He guaranteed them a loss, of course, but they had some chance of losing the game to begin with anyway -- about a 50% chance if you make the simplifying assumption that they're an average team. SNVA gives a far more reasonable value for Wells's start: it was worth about -0.5 games. That's as much as a single start can cost you. Wells didn't have the requisite 15 starts to show up in Table 12, but you can see from his record in Table 3 how much he was underestimated by APR.
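The two valuations of that start can be set side by side. This is a sketch using the round numbers quoted above (about 2 expected runs allowed, roughly 10 runs per game in the standings, a 50% win chance at the first pitch); the symbol P_win, the team's win probability after the start, is introduced here for illustration.

```latex
\text{APR-style value} \;\approx\; \frac{2 - 13}{10} \;=\; -1.1 \text{ games}
\qquad
\mathrm{SNVA} \;=\; P_{\mathrm{win}} - 0.5 \;\ge\; 0 - 0.5 \;=\; -0.5 \text{ games}
```

Since P_win cannot fall below zero, no single start can be charged with more than half a game under SNVA, however many runs are allowed.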

 

Effect of the park on win probability

One other question I've been looking at is how the value of starts is influenced by park effects. Figure 1 shows the SNVA for a 9-inning complete game in both Atlanta's Fulton County Stadium (the NL's most extreme hitters' park in 1992) and San Francisco's Candlestick Park (the NL's most extreme pitchers' park in 1992). We can see from the figure that the effect of the park on the value of the start is far less at the two extremes of start quality than it is for middle-of-the-road starts. The difference between Fulton County and Candlestick for the value of a 9-inning, 4-run start is almost four times as large as the difference between Fulton County and Candlestick for the value of a shutout.

Figure 1: SNVA for Fulton County Stadium (top line) and Candlestick
Park (bottom line), given that the starter pitched 9 innings

This would imply that methods of park adjustments which simply multiply a pitcher's "raw" value by a park factor might be over- or underestimating the park's actual effect on his value. Since the park's effect on very good or very bad starts is much less than on average starts, a reasonable hypothesis would be that very good or very bad pitchers deserve less of a boost (or less diminishment) to their rating than current park adjustment methods give them.

However, the preliminary investigation of this hypothesis I have done on real starting pitchers (with 1992 data) has failed to find much support for it. I'd still like to do some more work on this issue.

 

Weaknesses of the Approach

Here are a few of the problems with these measurements:

Conclusion

I've presented Support-Neutral Wins, Losses, and Value Added, three park- and league-adjusted measurements of the value of individual starts, and of starting pitchers. I feel these are a valuable addition to existing measurement methods, both because they can provide a measurement of pitcher worth in units which are familiar to all baseball fans (pitcher wins and losses) and because they seem to be a slightly more accurate measure of the true value of a start than existing methods.

Special thanks to Greg Spira, whose discussion sparked many of the ideas presented here. Thanks to David Tate and others on the Internet newsgroup rec.sport.baseball, who provided valuable feedback on the method. And thanks to my wife, Cindy, for reading this paper and giving me many useful suggestions.

 

References

[1] Thorn, J. and Palmer, P. (eds.), Total Baseball, 3rd edition, Harper Collins, New York, 1993.

[2] Thorn, J. and Palmer, P., The Hidden Game of Baseball, Doubleday Books, New York, 1985.

[3] James, B., The 1986 Bill James Baseball Abstract, Ballantine Books, New York, 1986, pp. 172-175.

 

Appendix: Park Effects

One possible way of incorporating park effect numbers into these measurements would be to take whatever final value the above formulas produce (SNW, SNL, or SNVA) and multiply it by some park effect constant for the pitcher's home park. This is essentially the approach Thorn and Palmer use in Total Baseball. But the method of calculating the Support-Neutral stats allows a potentially more informative use of park effects. Since park effects (as printed in Elias, e.g.) reflect how a park inflates or deflates average scoring ability, it makes sense to have the "average team" playing behind the pitcher affected by the park, and then calculate the likelihood that the pitcher's outing plus this park-adjusted average team will lead to a win. So for any game, the PInningScore (league average scoring) distribution is adjusted to reflect the park's effect on run scoring. The resulting number then reflects the park's effect on winning rather than cumulative run scoring/prevention.

The question then becomes: how do you translate a single park effect percentage like the ones in Elias (the only source of park effects I have) into an adjusted PInningScore distribution? There are an infinite number of ways to do this. The way I'm doing it now is to change the probability of scoring 0 runs by one factor, and change the probability of scoring i runs for i>1 by another factor, such that the total number of expected runs scored in an inning is increased/reduced by the Elias number. For example, if the Astrodome decreases scoring by 10%, I increase PInningScore(0) for the Astrodome by one factor, and decrease PInningScore(i) for i>1 by another factor, such that the expected single-inning score reflected by PInningScore is reduced by 10% from the park-neutral scoring distribution. If that isn't clear (and I'm sure it isn't), I should say that I don't think the exact method used makes much difference.
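One way to read that two-factor adjustment is sketched below. It takes the text's "i>1" literally, so the probability of exactly 1 run is left unchanged; that reading is an assumption, as are the invented distribution and the name park_adjust. The two factors are solved so that the probabilities still sum to 1 and the expected runs per inning are scaled by the park factor.

```python
def park_adjust(dist: list[float], park_factor: float) -> list[float]:
    """Rescale a single-inning scoring distribution for a park.

    dist[r] is the park-neutral chance of scoring r runs in an inning;
    park_factor is e.g. 0.90 for a park that cuts scoring by 10%.
    P(0) gets one factor and P(i) for i > 1 gets another; P(1) is kept
    as-is (a literal reading of the description, assumed here).
    """
    p0, p1 = dist[0], dist[1]
    tail_mass = sum(dist[2:])                           # P(2 or more runs)
    tail_runs = sum(r * p for r, p in enumerate(dist) if r >= 2)
    mean = p1 + tail_runs                               # expected runs per inning
    b = (park_factor * mean - p1) / tail_runs           # factor for i > 1
    a = (1.0 - p1 - b * tail_mass) / p0                 # factor for 0 runs
    if a < 0 or b < 0:
        raise ValueError("park factor too extreme for this distribution")
    return [a * p0, p1] + [b * p for p in dist[2:]]


if __name__ == "__main__":
    # Hypothetical park-neutral distribution (illustrative numbers only).
    neutral = [0.73, 0.15, 0.07, 0.03, 0.015, 0.005]
    adjusted = park_adjust(neutral, 0.90)
    print([round(p, 4) for p in adjusted])
```

If the intended split is instead between 0 runs and 1 or more runs, the second factor is simply the park factor itself and the first follows from requiring the probabilities to sum to 1.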

 

Footnotes

1 Adjusted Pitching Runs is the basic metric which Thorn and Palmer (the authors of Total Baseball) use to evaluate pitchers. APR is the number of runs prevented by a pitcher that a league average pitcher would've given up. The APR that I'm using in this paper differs from Thorn and Palmer's statistic in two ways: 1) I'm using runs where Thorn and Palmer use earned runs, and 2) the method of park adjustment I use is a simplification of the one used in Total Baseball. It is included here for comparison with SNVA.

2 Actually, it's probably inaccurate to use the word "worst" here, since the method of ranking the pitchers -- ranking them according to SNW-SNL -- sets the baseline for comparison at league average (anyone below .500 gets a negative rating). Of course, it's quite possible for a below-average pitcher to still be valuable to his team. A better method of producing this list might have been to compare a pitcher's SN record to a lower baseline, e.g., a .450 pitcher. This would have left pitchers like Hershiser and McCaskill, who pitched a lot of innings at somewhat below-league-average performance, off of the lists in favor of other pitchers who pitched fewer innings but at further-below-average performance.

