by Michael Wolverton
This article originally appeared in By The Numbers, SABR's Statistical Analysis subcommittee's newsletter, Volume 5, Number 4, Dec. 1993. The only change made to the original article is the regeneration of Figure 1 with new parks and values (the original numbers were lost), and the corresponding modification of the text describing that figure. One significant change has been made to the method for calculating the SN stats since this article was published; that change is documented in the summary.
Motivation
In recent years, we've seen the development and growing use of two measurements designed to evaluate starting pitchers on a game-by-game basis: Quality Starts and Game Score. Both measures try to look at the quality of each outing the starter has, rather than at average or cumulative performance over the course of a year, as ERA does. But both measures have their weaknesses as total measures of pitching performance.
The arguments against Quality Starts are well known. Detractors point out that the worst qualifying outing -- 6 innings and 3 earned runs -- is not "quality" at all. A related objection is that Quality Starts makes no attempt to quantify the degree of quality a start has -- 6 innings, 3 runs is the same as 8 innings, 2 runs, which is the same as a 9-inning shutout.
Partly in answer to these objections, Bill James developed the Game Score, which combines a starter's box score numbers (IP, H, ER, R, BB, K) using weights, where the weights are assigned such that the league average score is around 50, the best imaginable score is around 100, and the worst imaginable score (by someone outside the state of Colorado) is around 0. Game Score is acknowledged as an interesting measure of "game domination" by a starter, but it has weaknesses as a total measure of starter quality (i.e., his contribution to team victories): it's too dependent on strikeouts, possibly too dependent on hits and walks (after all, the number of runs given up is really the only thing that matters), and it isn't park-adjusted.
Despite the weaknesses of these two measures, looking at a pitcher's starts game-by-game is still a good idea. Looking at each start's contribution to winning, rather than cumulative run-prevention over the course of a year (ERA or Pitching Runs), can help us answer questions like: Given equal ERAs, do some pitchers pitch in a way that will tend to win more games than other pitchers? In particular, is it better for a starter to be flaky -- either very good or very bad on a given day -- or consistently average? Does the park have a smaller influence on the value of the start when the start is very good or very bad?
So here's what we'd like out of a stat measuring the quality of a start:
I've developed a couple of measurements that meet these five requirements. (Actually, the ideal stat would also be very easy to compute, but hey, 5 out of 6 isn't bad, right?). Support-Neutral Wins and Support-Neutral Losses (SNW and SNL) measure the expected number of wins and losses a pitcher would have with his outings, if he got average support from his offense and his bullpen. Support-Neutral Value Added (SNVA) measures the total number of games that an average team would win given the pitcher's starts, over the number of games they'd win with a league average starter. All of these stats are computed using only the number of innings pitched, number of runs given up, and the park the game was pitched in. SNVA may be a slightly more accurate measure of a starter's actual value compared to league average, but the SNW/L record has the advantage of being flexible and more understandable. Both of them, in my opinion, constitute an improvement over Thorn and Palmer's Pitching Runs as a total measure of starter worth.
Support-Neutral Wins and Losses
Support-Neutral Wins is calculated by determining the probability that a pitcher would get the win for each start he has, and then summing up the individual probabilities over all of his starts. The sum gives you the number of wins a pitcher could expect to get for an average team, given his performances. A "performance" here consists only of the number of innings pitched, the number of runs (not earned runs) given up, the park in which the game was played, and whether the pitcher was at home or on the road -- SNW assumes that these are the only things which influence whether the pitcher wins or loses.
The rest of this section describes the formulas that are used to calculate SNW; readers who aren't interested in the specific methods of calculation are welcome to skim or skip to the next section.
To calculate the probability that a pitcher wins the game, we just need to look at the definition of a win: A starting pitcher wins the game if his team has the lead when he's taken out of the game, and they never relinquish that lead. So, for a given outing by the starter, the probability that he gets the win is just the probability that his team will take the lead (score more runs than the starter gives up) by the time he's removed times the probability that they'll hold that lead until the game is over.
To put this into a formula, we just need to determine and add up the probabilities of all the different ways his team can take and hold a lead:
where
SNW(i, r) is the probability a starter who goes i innings and gives up r runs will get the win, given an average team playing behind him.
PScore(i, r) is the probability that an average team will score r runs in i innings.
PHold(k, i) is the probability that an average team will hold a k-run lead (without ever relinquishing it) for the i remaining innings until the end of the game.
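Written out under the simplifications discussed in the next few paragraphs (home pitcher, whole innings, no park adjustment), one form consistent with these three definitions is the following reconstruction. The nine-inning game length is written in explicitly as an assumption.

```latex
\mathrm{SNW}(i, r) \;=\; \sum_{k=1}^{\infty} \mathrm{PScore}(i,\, r + k) \cdot \mathrm{PHold}(k,\, 9 - i)
```

Each term is the chance that the pitcher's team has scored exactly r + k runs (a k-run lead) by the time he leaves, times the chance that a k-run lead survives the remaining 9 - i innings.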
The above formula is actually a simplification of the formula I use in my software to calculate SNW (I'll refer to the formula in my software as the "real" SNW formula). In order to make it easier to explain, I made a few assumptions to get the formula above. First, that formula assumes that the starter comes out of the game after pitching a full inning (i.e., he pitches no extra thirds of an inning). The formula is complicated somewhat when thirds of an inning are taken into account, but the same general idea applies: his team must be leading when he comes out, and his team must hold the lead for the extra thirds in the inning he leaves, plus all the rest of the remaining innings. The real SNW formula does take thirds of an inning into account.
Second, the above formula doesn't explicitly take the park into account. To take park effects into account, we need to make SNW, PScore, and PHold be functions of the park in which the game is played. A hitter's park should inflate the probabilities that an average team will score a high number of runs, and a pitcher's park should do the opposite. The real SNW formula does take park into account. I talk a little more about my handling of park effects in the Appendix.
Third, the above formula doesn't take into account whether the starter is pitching at home or on the road. Maybe contrary to intuition, this does make a difference. Consider a starter who leaves after pitching the 7th inning: if he's at home, he's pitched the top of the 7th, so he gets credit for the runs his team scored in the first 6 innings, plus the runs they score in the bottom of the 7th; if he's on the road, however, he pitched the bottom of the 7th, so he gets credit for the runs his team scored in the first 7 innings, plus the runs they score in the top of the 8th. So, all other things being equal, it's easier for pitchers to get wins (and harder for them to get losses) when they pitch on the road. The formula above is for a pitcher pitching at home, and the road formula is slightly different. The real SNW formula does take home/road status into account.
Finally, the above formula doesn't quite reflect the full definition of a pitcher's win -- a starter can't get the win unless he goes 5 innings or more. Presumably, this extra condition was put into the win rule to reduce the number of undeserving starters getting lucky wins. But when you're assigning fractions of a win, rather than 1 win or 0 wins, there's no possibility of getting lucky. So, the real SNW/L formula does not take the five-inning condition into account, although, for the purposes of comparison, I do calculate an expected win (E(W)) number which is equal to 0 if the pitcher goes fewer than 5 innings and equal to SNW otherwise.
Let's finish off the formula above. PScore is easy to find recursively, provided you know an average team's single-inning scoring distribution, PInningScore:
where
PInningScore(r) is the probability that an average team will score r runs in an inning.
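The recursion can be sketched in a few lines of Python: the chance of scoring r runs in i innings is the sum, over the runs k scored in the last inning, of PInningScore(k) times PScore(i-1, r-k), with PScore(0, 0) = 1. The distribution `P_INNING` below is a made-up placeholder, not the one taken from the linescores, and the function name `p_score` is only illustrative.

```python
from functools import lru_cache

# Hypothetical single-inning scoring distribution: P_INNING[r] is the chance
# of scoring r runs in one inning (illustrative values, not the article's data).
P_INNING = (0.73, 0.15, 0.07, 0.03, 0.015, 0.005)


@lru_cache(maxsize=None)
def p_score(innings: int, runs: int) -> float:
    """Probability that an average team scores exactly `runs` in `innings` innings."""
    if runs < 0:
        return 0.0
    if innings == 0:
        # With no innings played, the only possible total is zero runs.
        return 1.0 if runs == 0 else 0.0
    # Condition on the number of runs k scored in the most recent inning.
    return sum(
        P_INNING[k] * p_score(innings - 1, runs - k)
        for k in range(min(runs, len(P_INNING) - 1) + 1)
    )
```

For instance, `p_score(9, 0)` is just the chance of nine straight scoreless innings, `P_INNING[0] ** 9`.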
PHold is a little more complicated, since you have to see to it that the pitcher's team never relinquishes the lead. Still, it's not too hard to reduce it to the following (below, "tr" stands for the number of runs the pitcher's team scores in an inning, and "or" stands for the number the opposing team scores in an inning):
The only remaining unknown is the single-inning scoring distribution, PInningScore. But that's readily available from linescores of past games. The scoring distribution (separate distributions for each league) I'm using right now was taken from a few weeks of linescores in USA TODAY from late-April and early-May of 1992. I'll probably be able to get a more accurate distribution someday, but I'm sure that this one is close enough.
The SNL value for a single start is calculated analogously to SNW.
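Putting the pieces together, here is a minimal sketch of the simplified calculation (home pitcher, whole innings, no park adjustment). It is one reading of the description above, not the software's "real" formula. The PHold recursion assumes the opponent bats first in each remaining inning and that a tie counts as relinquishing the lead; `opp` plays the role of "or" (renamed because `or` is a Python keyword) and `team` the role of "tr". The distribution and the sample starts are hypothetical.

```python
from functools import lru_cache

# Hypothetical single-inning scoring distribution (illustrative values only).
P_INNING = (0.73, 0.15, 0.07, 0.03, 0.015, 0.005)
MAX_R = len(P_INNING) - 1  # most runs this toy distribution allows per inning


@lru_cache(maxsize=None)
def p_score(innings, runs):
    """P(average team scores exactly `runs` in `innings` innings)."""
    if runs < 0:
        return 0.0
    if innings == 0:
        return 1.0 if runs == 0 else 0.0
    return sum(P_INNING[k] * p_score(innings - 1, runs - k)
               for k in range(min(runs, MAX_R) + 1))


@lru_cache(maxsize=None)
def p_hold(lead, innings_left):
    """P(a `lead`-run lead is never tied or lost over `innings_left` innings)."""
    if lead <= 0:
        return 0.0
    if innings_left == 0:
        return 1.0
    # The opponent must score fewer than `lead` runs in its half (opp < lead);
    # the team's own runs (`team`) then pad the lead for the next inning.
    return sum(P_INNING[opp] * p_team * p_hold(lead - opp + team, innings_left - 1)
               for opp in range(min(lead, MAX_R + 1))
               for team, p_team in enumerate(P_INNING))


def snw(innings, runs, game_length=9):
    """Simplified SNW for one home start: whole innings, no park adjustment."""
    most_team_runs = innings * MAX_R
    return sum(p_score(innings, runs + k) * p_hold(k, game_length - innings)
               for k in range(1, most_team_runs - runs + 1))


season = [(9, 0), (7, 2), (6, 3), (4, 6)]       # hypothetical (innings, runs) starts
season_snw = sum(snw(i, r) for i, r in season)  # expected wins over those starts
```

A nine-inning shutout is a useful sanity check: nothing remains to be held, so `snw(9, 0)` reduces to the chance that the team scores at least once in nine innings.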
Support-Neutral Value Added
SNW and SNL give us a nice way of getting a "fair" W/L record for a starter, which can then be used to compare to his actual W/L record, or a replacement-level winning percentage, etc. (see the Results section). But these numbers calculate how likely it is that the pitcher will win or lose the game -- i.e., get the "W" or "L" next to his name in the box score. A related but slightly different notion is the likelihood that the team will win when a pitcher takes the mound. In measuring the starter's contribution to team victories, we'd like to evaluate how much the outing by the starter changes the team's chance of winning from what it was at the beginning of the game (which I'll assume to be 50%). This is what SNVA is designed to measure.
Not surprisingly, the formula for SNVA looks pretty similar to the formula for SNW:
where
SNVA(i, r) is the difference between an average team's chance of winning after the starter has left after pitching i innings and giving up r runs, and their chance of winning at the beginning of the game (50%).
PScore(i, r) was defined above
PATWin(r, i) is the chance that an average team will eventually win the game given that there are i innings left and the difference between their score and their opponents' score is r.
Also not surprisingly, PATWin looks a lot like PHold:
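A minimal sketch of both pieces, under the same simplifications as the SNW formula (home pitcher, whole innings, no park adjustment). SNVA is taken here as the sum over the team's possible run totals t of PScore(i, t) times PATWin(t - r, 9 - i), minus 0.5. Two things are assumptions of this sketch rather than statements from the text: a game tied after nine innings is treated as a coin flip, and the scoring distribution is a made-up placeholder.

```python
from functools import lru_cache

# Hypothetical single-inning scoring distribution (illustrative values only).
P_INNING = (0.73, 0.15, 0.07, 0.03, 0.015, 0.005)
MAX_R = len(P_INNING) - 1


@lru_cache(maxsize=None)
def p_score(innings, runs):
    """P(average team scores exactly `runs` in `innings` innings)."""
    if runs < 0:
        return 0.0
    if innings == 0:
        return 1.0 if runs == 0 else 0.0
    return sum(P_INNING[k] * p_score(innings - 1, runs - k)
               for k in range(min(runs, MAX_R) + 1))


@lru_cache(maxsize=None)
def p_at_win(diff, innings_left):
    """P(average team eventually wins) when it leads by `diff` (may be negative)."""
    if innings_left == 0:
        # Assumption: extra innings between two average teams are a coin flip.
        return 1.0 if diff > 0 else 0.0 if diff < 0 else 0.5
    # Unlike PHold, the lead may change hands; only the final margin matters.
    return sum(p_team * p_opp * p_at_win(diff + team - opp, innings_left - 1)
               for team, p_team in enumerate(P_INNING)
               for opp, p_opp in enumerate(P_INNING))


def snva(innings, runs, game_length=9):
    """Change in win probability (from 50%) produced by one start."""
    win = sum(p_score(innings, t) * p_at_win(t - runs, game_length - innings)
              for t in range(innings * MAX_R + 1))
    return win - 0.5
```

By construction `snva(0, 0)` is zero (an even game before the first pitch), and no start can be worth more than +0.5 or less than -0.5.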
What SNVA gives us (when summed over all a pitcher's starts) is the number of games in the standings he's worth to his team above the average starter. Of course, this is exactly the same unit (games above the average player) that all of Total Baseball's [1] measurements are in. So it'll be interesting to compare SNVA to Thorn and Palmer's Adjusted Pitching Runs (see footnote 1) to see how well they correlate and also where the differences lie.
Results
Best, worst, luckiest, and unluckiest starters of 1992
That's enough of the gory details of the calculation of the stats. Let's look at the fun stuff -- what the stats tell us about real pitchers. I tracked all starting pitchers in the majors over the 1992 season, and Tables 1 and 2 show the top pitchers in both leagues for 1992. Each table shows the pitcher's Support-Neutral Wins (SNW), Losses (SNL), and Winning Percentage (SNPct), followed by his actual win-loss record (W, L), his runs allowed per 9 innings (RA), his Adjusted Pitching Runs (APR), and his Support-Neutral Value Added (SNVA). Interestingly, Greg Maddux, with the fabulous year he had pitching in Wrigley, was the only pitcher in either league who came close to "deserving" to win 20 games.
| Pitcher | Team | SNW | SNL | SNPct | W | L | RA | APR | SNVA |
|---|---|---|---|---|---|---|---|---|---|
| Mussina | BAL | 17.2 | 7.8 | .688 | 18 | 5 | 2.61 | 47.0 | 4.60 |
| Clemens | BOS | 17.5 | 8.5 | .674 | 18 | 11 | 2.92 | 43.8 | 4.39 |
| Appier | KCR | 15.2 | 6.6 | .698 | 15 | 8 | 2.55 | 42.6 | 4.08 |
| Guzman,Ju | TOR | 13.4 | 6.4 | .679 | 16 | 5 | 2.79 | 32.3 | 3.34 |
| Nagy | CLE | 16.3 | 9.9 | .623 | 17 | 10 | 3.25 | 33.3 | 3.11 |
| Eldred | MIL | 8.2 | 2.4 | .776 | 11 | 2 | 1.88 | 28.1 | 2.81 |
| McDowell | CHI | 16.3 | 10.7 | .602 | 20 | 10 | 3.28 | 30.5 | 2.53 |
| Smiley | MIN | 16.0 | 10.5 | .603 | 16 | 9 | 3.47 | 28.3 | 2.75 |
| Navarro | MIL | 15.8 | 10.8 | .595 | 17 | 11 | 3.59 | 22.8 | 2.45 |
| Abbott,J | CAL | 13.6 | 8.6 | .612 | 7 | 15 | 3.11 | 27.7 | 2.36 |
| Viola | BOS | 15.8 | 11.2 | .586 | 13 | 12 | 3.74 | 21.4 | 2.35 |
| Fleming | SEA | 15.1 | 10.7 | .586 | 17 | 10 | 3.73 | 19.7 | 2.10 |
| Perez,M | NYY | 14.9 | 10.5 | .586 | 13 | 16 | 3.42 | 26.3 | 1.90 |
| Wegman | MIL | 15.6 | 11.5 | .576 | 13 | 14 | 3.58 | 24.4 | 2.06 |
| Erickson | MIN | 13.3 | 9.8 | .574 | 13 | 12 | 3.65 | 20.8 | 1.75 |
| Bosio | MIL | 14.3 | 11.1 | .563 | 16 | 6 | 3.89 | 13.6 | 1.52 |
| Key | TOR | 13.3 | 10.4 | .561 | 13 | 13 | 3.66 | 18.0 | 1.42 |
| Brown,K | TEX | 15.5 | 12.9 | .545 | 21 | 11 | 3.96 | 14.5 | 1.18 |
| Welch | OAK | 8.0 | 5.8 | .580 | 11 | 7 | 3.42 | 9.7 | 0.91 |
| Rasmussen | KCR | 3.0 | 0.8 | .785 | 4 | 1 | 1.67 | 11.4 | 1.09 |
Table 1: Top 20 AL Starters in 1992, ranked by SNW-SNL
| Pitcher | Team | SNW | SNL | SNPct | W | L | RA | APR | SNVA |
|---|---|---|---|---|---|---|---|---|---|
| Maddux,G | CHI | 19.5 | 7.4 | .724 | 20 | 11 | 2.28 | 53.9 | 5.75 |
| Tewksbury | STL | 16.1 | 7.3 | .687 | 15 | 5 | 2.45 | 38.5 | 4.12 |
| Schilling | PHI | 13.9 | 6.8 | .670 | 12 | 9 | 2.59 | 31.1 | 3.37 |
| Morgan | CHI | 16.3 | 9.5 | .632 | 16 | 8 | 3.00 | 30.4 | 3.22 |
| Rijo | CIN | 13.9 | 8.1 | .632 | 15 | 10 | 2.86 | 28.5 | 2.57 |
| Smoltz | ATL | 16.6 | 11.0 | .601 | 15 | 12 | 3.28 | 25.1 | 2.67 |
| Glavine | ATL | 15.1 | 9.8 | .608 | 20 | 8 | 3.24 | 23.9 | 2.71 |
| Martinez,D | MON | 14.5 | 9.1 | .613 | 16 | 11 | 2.98 | 24.0 | 2.50 |
| Swindell | CIN | 13.8 | 8.5 | .619 | 12 | 7 | 3.05 | 24.5 | 2.56 |
| Swift | SFG | 10.4 | 5.1 | .670 | 9 | 3 | 2.36 | 23.6 | 2.51 |
| Drabek | PIT | 15.9 | 10.8 | .595 | 15 | 11 | 2.95 | 26.7 | 2.32 |
| Fernandez,S | NYM | 13.6 | 8.8 | .608 | 14 | 11 | 2.81 | 24.7 | 2.25 |
| Hill | MON | 14.0 | 10.0 | .583 | 16 | 9 | 3.14 | 19.3 | 1.93 |
| Leibrandt | ATL | 13.5 | 9.5 | .586 | 15 | 7 | 3.68 | 11.8 | 2.02 |
| Smith,P | ATL | 5.6 | 2.1 | .724 | 7 | 0 | 2.22 | 16.2 | 1.69 |
| Wakefield | PIT | 6.4 | 3.3 | .656 | 8 | 1 | 2.54 | 13.8 | 1.42 |
| Rivera | PHI | 6.5 | 3.7 | .639 | 7 | 3 | 2.95 | 10.8 | 1.34 |
| Benes | SDP | 14.0 | 11.3 | .553 | 13 | 14 | 3.50 | 10.7 | 1.22 |
| Portugal | HOU | 6.6 | 4.0 | .621 | 5 | 3 | 2.69 | 12.5 | 1.18 |
| Avery | ATL | 14.0 | 11.5 | .549 | 11 | 11 | 3.66 | 14.8 | 1.15 |
Table 2: Top 20 NL Starters in 1992, ranked by SNW-SNL
On the flip-side, Tables 3 and 4 show the worst 10 starting pitchers in 1992 for each league (see footnote 2). Not surprisingly, many of these guys showed up in different uniforms in 1993, several on expansion teams.
| Pitcher | Team | SNW | SNL | SNPct | W | L | RA | APR | SNVA |
|---|---|---|---|---|---|---|---|---|---|
| Armstrong | CLE | 5.2 | 11.5 | .313 | 3 | 15 | 6.37 | -28.4 | -3.08 |
| Milacki | BAL | 4.5 | 9.5 | .320 | 6 | 8 | 6.18 | -21.8 | -2.32 |
| Terrell | DET | 2.9 | 7.5 | .280 | 3 | 6 | 6.98 | -22.9 | -2.26 |
| Slusarski | OAK | 2.5 | 6.9 | .265 | 5 | 5 | 6.25 | -18.7 | -2.05 |
| Sanderson | NYY | 9.9 | 14.0 | .414 | 12 | 11 | 5.40 | -22.4 | -2.02 |
| Aldred | DET | 2.4 | 6.5 | .273 | 2 | 7 | 7.63 | -21.7 | -1.89 |
| McCaskill | CHI | 10.2 | 14.2 | .417 | 12 | 13 | 5.00 | -16.1 | -1.92 |
| Wells | TOR | 3.4 | 6.9 | .332 | 6 | 7 | 7.70 | -27.7 | -1.81 |
| Stieb | TOR | 3.4 | 6.7 | .337 | 3 | 6 | 5.92 | -13.3 | -1.50 |
| Otto | CLE | 3.9 | 7.2 | .354 | 5 | 9 | 6.75 | -19.8 | -1.57 |
Table 3: Bottom 10 AL Starters in 1992, ranked by SNW-SNL
| Pitcher | Team | SNW | SNL | SNPct | W | L | RA | APR | SNVA |
|---|---|---|---|---|---|---|---|---|---|
| Bowen | HOU | 0.6 | 6.1 | .094 | 0 | 7 | 12.22 | -31.3 | -2.61 |
| Wilson,T | SFG | 7.0 | 11.3 | .384 | 8 | 14 | 4.79 | -18.5 | -2.03 |
| Abbott,K | PHI | 4.5 | 8.3 | .352 | 1 | 14 | 4.92 | -11.4 | -1.84 |
| Martinez,R | LAD | 7.3 | 11.1 | .397 | 8 | 11 | 4.90 | -19.1 | -1.84 |
| Henry,B | HOU | 8.3 | 11.7 | .414 | 6 | 9 | 4.40 | -12.4 | -1.57 |
| Young,A | NYM | 2.8 | 6.2 | .313 | 1 | 7 | 5.79 | -16.8 | -1.63 |
| Black | SFG | 8.6 | 11.9 | .420 | 10 | 12 | 4.47 | -14.7 | -1.54 |
| Hershiser | LAD | 10.2 | 13.3 | .434 | 10 | 15 | 4.31 | -12.5 | -1.60 |
| Hammond | CIN | 7.0 | 10.0 | .409 | 7 | 10 | 4.61 | -7.4 | -1.36 |
| Blair | HOU | 1.4 | 4.5 | .241 | 1 | 5 | 7.51 | -16.8 | -1.52 |
Table 4: Bottom 10 NL Starters in 1992, ranked by SNW-SNL
This method also allows you to evaluate the level of luck a pitcher experienced in his W/L record -- i.e. it allows you to look at how much a pitcher's actual W/L record differs from his expected W/L record given the way he pitched. Tables 5 through 8 show the luckiest and unluckiest starters in each league in 1992. No one should be surprised that Jack Morris, who compiled a 21-6 record despite a 4+ ERA, was far and away the luckiest starter in either league last year. SNW/L evaluation shows that you'd expect his 1992 performance to produce a 13-13 mark if he had gotten average support. Equally unsurprising is the result that Jim Abbott was the unluckiest pitcher in either league. The Angels gave him enough support only for a miserable 7-15 record, while his pitching actually merited something closer to 13-9.
Table 5: Luckiest 10 AL Starters in 1992, ranked by W-E(W) + E(L)-L
Table 6: Unluckiest 10 AL Starters in 1992, ranked by W-E(W) + E(L)-L
Table 7: Luckiest 10 NL Starters in 1992, ranked by W-E(W) + E(L)-L
Table 8: Unluckiest 10 NL Starters in 1992, ranked by W-E(W) + E(L)-L
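The ranking quantity in these captions, W-E(W) + E(L)-L, is simple enough to sketch directly. The expected records below are the rounded figures quoted in the text for Morris and Abbott, not exact E(W) and E(L) values, so the results are only approximate.

```python
def luck(w, l, exp_w, exp_l):
    """Luck in decisions: extra wins received plus losses avoided."""
    return (w - exp_w) + (exp_l - l)


# (actual W, actual L, expected W, expected L), using the rounded
# expectations quoted in the prose rather than exact table values.
starters = {
    "Morris": (21, 6, 13, 13),
    "Abbott,J": (7, 15, 13, 9),
}

# Luckiest first.
ranked = sorted(starters, key=lambda name: luck(*starters[name]), reverse=True)
```

On these rounded figures Morris comes out about 15 decisions to the good and Abbott about 12 to the bad.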
League total numbers
In theory, the support-neutral record of the entire league should come close to the actual win-loss record of the league, and in fact, in 1992, SNW/L did appear to predict league W/L pretty well. Table 9 shows both the expected and actual W/L totals for each league in 1992. The National League's record corresponded very well to the record expected by the model, with no-decisions being underpredicted only slightly by SNW/L. The American League is predicted a little less successfully -- there were nearly 30 more wins in the league than expected, and nearly 10 fewer losses than expected. I believe that part of the discrepancy between expected record and actual record can be explained by the fact that relief pitchers prevented runs better than starters in 1992. Since starters are competing for the (actual) decision primarily with the other starter, it makes sense that starters would get a few more (actual) wins than predicted by a model which has them competing with league average pitching for the decision.
| League | E(W) | E(L) | E(Pct.) | W | L | Pct. |
|---|---|---|---|---|---|---|
| NL | 660.9 | 690.3 | .489 | 655 | 678 | .491 |
| AL | 776.1 | 846.7 | .478 | 805 | 837 | .490 |
Table 9: Expected and Actual records of all starters in the leagues
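The percentages and gaps in Table 9 can be checked with a few lines of arithmetic. This sketch only recomputes figures from the table; the dictionary layout and function name are illustrative.

```python
# Table 9 rows: (expected W, expected L, actual W, actual L)
table9 = {
    "NL": (660.9, 690.3, 655, 678),
    "AL": (776.1, 846.7, 805, 837),
}


def summarize(exp_w, exp_l, w, l):
    """Winning percentages plus actual-minus-expected wins and losses."""
    return {
        "exp_pct": exp_w / (exp_w + exp_l),
        "pct": w / (w + l),
        "extra_wins": w - exp_w,      # positive: more wins than the model expected
        "extra_losses": l - exp_l,    # negative: fewer losses than expected
    }


report = {league: summarize(*row) for league, row in table9.items()}
```

For the AL this gives about 28.9 more wins and about 9.7 fewer losses than expected.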
Value of "flaky" and "steady" pitchers
Do the Support-Neutral stats tell us anything that Thorn and Palmer's Adjusted Pitching Runs weren't already telling us? Since both APR and SNVA are trying to measure exactly the same thing (albeit by different methods), we'd expect there to be a pretty strong correlation between them. There is. For most pitchers, SNVA (whose unit is "games above average") is approximately equal to one-tenth of APR (whose unit is "runs above average"). This is what you'd expect given the well-known result that each 10 runs prevented (or gained) leads on average to about 1 extra win in the standings (see, e.g., [2]). However, there are plenty of cases where APR and SNVA give significantly different evaluations. Look at the 1992 records of Charlie Leibrandt and Melido Perez:
| Pitcher | APR | SNVA |
|---|---|---|
| Leibrandt | 11.8 | 2.02 |
| Perez,M | 26.3 | 1.90 |
APR evaluates Perez as being 14.5 runs -- about one-and-a-half games -- better than Leibrandt. However, SNVA shows that, when the pitchers' performance is evaluated game-by-game, Leibrandt was actually a little better than Perez.
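To put the two measures in the same unit, APR can be divided by the 10-runs-per-win rule of thumb and compared with SNVA. The sketch below does that for the two pitchers above; the constant and the function name are illustrative.

```python
RUNS_PER_WIN = 10.0  # rule of thumb cited in the text (see [2])

# (APR in runs above average, SNVA in games above average)
pitchers = {
    "Leibrandt": (11.8, 2.02),
    "Perez,M": (26.3, 1.90),
}


def snva_minus_apr_games(apr, snva):
    """Games by which SNVA exceeds the APR-based estimate of games above average."""
    return snva - apr / RUNS_PER_WIN


gaps = {name: round(snva_minus_apr_games(*vals), 2) for name, vals in pitchers.items()}
```

Here SNVA rates Leibrandt about 0.84 games higher than APR does, and Perez about 0.73 games lower.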
The key to this discrepancy between the two measurements is found in the amount of consistency the two pitchers exhibited in their starts. Perez was a model of consistency last year; he rarely got bombed, but he also was rarely dominating. Leibrandt, on the other hand, was one of the least consistent pitchers in the majors. And that is the most surprising result I've seen so far from these SN stats: run-prevention stats such as ERA and APR tend to undervalue flaky pitchers, and overvalue consistent ones, at least when you consider them pitching for an average team. Tables 10 through 13 show the "flakiest" (most inconsistent) and "steadiest" (most consistent) pitchers in the leagues last year, as evaluated by the variance of the SNVA of their individual starts. You can see from those tables that APR pretty consistently underestimates a pitcher's value when the pitcher is flaky, and pretty consistently overestimates his value when he's steady. 9 of the 10 flakiest pitchers in both the NL and AL were underestimated by APR, and 8 of the 10 steadiest in the NL and 10 of the 10 steadiest in the AL were overestimated by APR. And the pitchers for whom there were really large discrepancies between APR and SNVA -- Leibrandt, Kyle Abbott, Gooden, Hammond, Sutcliffe, Perez, Kamieniecki, McDowell -- all showed up near the top of the predicted list.
The reason for this undervaluing is that APR counts all runs as equal, while in fact all runs do not contribute an equal amount toward winning/losing a game. In particular, Bill James did a study that showed that runs scored by a team after they've already scored 5 in a game do not contribute as much toward the probability of winning as those first 5 runs did [3]. So, pitchers who give up more than 5 runs in a couple of games will be undervalued by ERA and APR, because those really crummy outings probably weren't quite as crummy as ERA and APR would have you believe.
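The "flakiness" ranking described above, the variance of a pitcher's per-start SNVA, can be sketched as follows. The per-start values are invented for illustration, and the use of population variance (rather than sample variance) is an assumption, since the text doesn't say which was used.

```python
from statistics import pvariance

# Hypothetical per-start SNVA values for two pitchers with the same season total.
starts = {
    "steady": [0.05, 0.10, 0.00, 0.08, 0.02],
    "flaky": [0.45, -0.40, 0.40, -0.35, 0.15],
}

# Variance of per-start SNVA: higher means more inconsistent from start to start.
flakiness = {name: pvariance(values) for name, values in starts.items()}

# Flakiest first.
ranked = sorted(flakiness, key=flakiness.get, reverse=True)
```

Both invented pitchers add up to the same season SNVA (0.25 games), so the ranking separates them purely on consistency.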
Table 10: Flakiest 10 NL Starters in 1992, ranked by variance of SNVA (15 starts minimum)
Table 11: Steadiest 10 NL Starters in 1992, ranked by variance of SNVA (15 starts minimum)
Table 12: Flakiest 10 AL Starters in 1992, ranked by variance of SNVA (15 starts minimum)
Table 13: Steadiest 10 AL Starters in 1992, ranked by variance of SNVA (15 starts minimum)
As an example of this, consider a David Wells outing from 1992: he gave up 13 runs in 4+ innings. APR just subtracts his 13 runs from the number of runs a league average pitcher would have given up in those same 4 innings (about 2), and concludes that Wells was worth about -11 runs, or -1.1 games, in that start. Did Wells really cost the Blue Jays more than a game in the standings with that awful start? Of course not. He guaranteed them a loss, of course, but they had some chance of losing the game to begin with anyway -- about a 50% chance if you make the simplifying assumption that they're an average team. SNVA gives a far more reasonable value for Wells's start: it was worth about -0.5 games. That's as much as a single start can cost you. Wells didn't have the requisite 15 starts to show up in Table 12, but you can see from his record in Table 3 how much he was underestimated by APR.
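The arithmetic of the Wells example can be laid out in a short sketch. The function names are illustrative, and the 50% starting win probability is the same simplifying assumption used in the text.

```python
RUNS_PER_WIN = 10.0  # rule of thumb: about 10 runs per win


def apr_games(runs_allowed, league_avg_runs):
    """Games above average implied by a run-based measure for one start."""
    return (league_avg_runs - runs_allowed) / RUNS_PER_WIN


def snva_bounds(start_win_prob=0.5):
    """Worst and best possible SNVA for one start: win probability stays in [0, 1]."""
    return (0.0 - start_win_prob, 1.0 - start_win_prob)


wells_apr_games = apr_games(runs_allowed=13, league_avg_runs=2)  # about -1.1 games
worst_snva, best_snva = snva_bounds()                            # -0.5 and +0.5
```

A run-based measure has no floor, so one disastrous start can cost more than a full game on paper, while a win-probability measure cannot charge a start with more than the half-game the team expected going in.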
Effect of the park on win probability
One other question I've been looking at is how the value of starts is influenced by park effects. Figure 1 shows the SNVA for a 9-inning complete game in both Atlanta's Fulton County Stadium (the NL's most extreme hitters' park in 1992) and San Francisco's Candlestick Park (the NL's most extreme pitchers' park in 1992). We can see from the figure that the effect of the park on the value of the start is far less at the two extremes of start quality than it is for middle-of-the-road starts. The difference between Fulton County and Candlestick for the value of a 9-inning, 4-run start is almost four times as large as the difference between Fulton County and Candlestick for the value of a shutout.
Figure 1: SNVA for Fulton County Stadium (top line) and Candlestick Park (bottom line), given that the starter pitched 9 innings
This would imply that methods of park adjustments which simply multiply a pitcher's "raw" value by a park factor might be over- or underestimating the park's actual effect on his value. Since the park's effect on very good or very bad starts is much less than on average starts, a reasonable hypothesis would be that very good or very bad pitchers deserve less of a boost (or less diminishment) to their rating than current park adjustment methods give them.
However, the preliminary investigation of this hypothesis I have done on real starting pitchers (with 1992 data) has failed to find much support for it. I'd still like to do some more work on this issue.
Weaknesses of the Approach
Here are a few of the problems with these measurements:
Conclusion
I've presented Support-Neutral Wins, Losses, and Value Added, three park- and league-adjusted measurements of the value of individual starts, and of starting pitchers. I feel these are a valuable addition to existing measurement methods, both because they can provide a measurement of pitcher worth in units which are familiar to all baseball fans (pitcher wins and losses) and because they seem to be a slightly more accurate measure of the true value of a start than existing methods.
Special thanks to Greg Spira, whose discussion sparked many of the ideas presented here. Thanks to David Tate and others on the Internet newsgroup rec.sport.baseball, who provided valuable feedback on the method. And thanks to my wife, Cindy, for reading this paper and giving me many useful suggestions.
References
[1] Thorn, J. and Palmer, P. (eds.), Total Baseball, 3rd edition, Harper Collins, New York, 1993.
[2] Thorn, J. and Palmer, P., The Hidden Game of Baseball, Doubleday Books, New York, 1985.
[3] James, B., The 1986 Bill James Baseball Abstract, Ballantine Books, New York, 1986, pp. 172-175.
Appendix: Park Effects
One possible way of incorporating park effect numbers into these measurements would be to take whatever final value the above formulas produce (SNW, SNL, or SNVA) and multiply it by some park effect constant for the pitcher's home park. This is essentially the approach Thorn and Palmer use in Total Baseball. But the method of calculating the Support-Neutral stats allows a potentially more informative use of park effects. Since park effects (as printed in Elias, e.g.) reflect how a park inflates or deflates average scoring ability, it makes sense to have the "average team" playing behind the pitcher affected by the park, and then calculate the likelihood that the pitcher's outing plus this park-adjusted average team will lead to a win. So for any game, the PInningScore (league average scoring) distribution is adjusted to reflect the park's effect on run scoring. The resulting number then reflects the park's effect on winning rather than cumulative run scoring/prevention.
The question then becomes: how do you translate a single park effect percentage like the ones in Elias (the only source of park effects I have) into an adjusted PInningScore distribution? There are an infinite number of ways to do this. The way I'm doing it now is to change the probability of scoring 0 runs by one factor, and change the probability of scoring i runs for i>1 by another factor, such that the total number of expected runs scored in an inning is increased/reduced by the Elias number. For example, if the Astrodome decreases scoring by 10%, I increase PInningScore(0) for the Astrodome by one factor, and decrease PInningScore(i) for i>1 by another factor, such that the expected single-inning score reflected by PInningScore is reduced by 10% from the park-neutral scoring distribution. If that isn't clear (and I'm sure it isn't), I should say that I don't think the exact method used makes much difference.
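A minimal sketch of a two-factor adjustment along these lines, using the Astrodome example. It follows the description literally: the probability of 0 runs gets one factor, the probabilities for i>1 get another, and the probability of exactly 1 run is left alone. The `first_scaled` parameter, the function name, and the neutral distribution are all hypothetical. Setting `first_scaled=1` would scale every nonzero score instead.

```python
def park_adjust(dist, park_factor, first_scaled=2):
    """Return a park-adjusted copy of a single-inning scoring distribution.

    dist[r] is the park-neutral probability of scoring r runs in an inning.
    P(0) is multiplied by one factor (a) and every P(r) with r >= first_scaled
    by another (b); any probabilities in between are left unchanged.
    """
    mean = sum(r * p for r, p in enumerate(dist))
    fixed_mass = sum(dist[1:first_scaled])
    fixed_mean = sum(r * dist[r] for r in range(1, first_scaled))
    tail_mass = sum(dist[first_scaled:])
    tail_mean = sum(r * dist[r] for r in range(first_scaled, len(dist)))
    # b makes expected runs hit park_factor * mean; a restores a total of 1.
    b = (park_factor * mean - fixed_mean) / tail_mean
    a = (1.0 - fixed_mass - b * tail_mass) / dist[0]
    return ([a * dist[0]] + list(dist[1:first_scaled])
            + [b * p for p in dist[first_scaled:]])


# Hypothetical park-neutral distribution (illustrative values only).
neutral = [0.73, 0.15, 0.07, 0.03, 0.015, 0.005]

# A park that cuts scoring by 10%, as in the Astrodome example.
astrodome = park_adjust(neutral, 0.90)
```

The two constraints (probabilities sum to 1, expected runs scaled by the park factor) pin down both factors, which is why a single park percentage is enough to produce a full adjusted distribution.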
Footnotes
1 Adjusted Pitching Runs is the basic metric which Thorn and Palmer (the authors of Total Baseball) use to evaluate pitchers. APR is the number of runs prevented by a pitcher that a league average pitcher would've given up. The APR that I'm using in this paper differs from Thorn and Palmer's statistic in two ways: 1) I'm using runs where Thorn and Palmer use earned runs, and 2) the method of park adjustment I use is a simplification of the one used in Total Baseball. It is included here for comparison with SNVA.
2 Actually, it's probably inaccurate to use the word "worst" here, since the method of ranking the pitchers -- ranking them according to SNW-SNL -- sets the baseline for comparison at league average (anyone below .500 gets a negative rating). Of course, it's quite possible for a below-average pitcher to still be valuable to his team. A better method of producing this list might have been to compare a pitcher's SN record to a lower baseline, e.g., a .450 pitcher. This would have left pitchers like Hershiser and McCaskill, who pitched a lot of innings at somewhat below-league-average performance, off of the lists in favor of other pitchers who pitched fewer innings but at further-below-average performance.