January 30, 1998
DTs vs. MLEs - A Validation Study
by Clay Davenport

When we first started cranking out the Baseball Prospectus a few years ago,
our goal was to produce a top-notch baseball guide. We wanted to produce
something that gives fans a solid grasp of the abilities and accomplishments
of the players at both the major league and minor league levels. While we
sought from the start to craft a book that people would enjoy reading, we also
sought to provide the best available statistical tools for player analysis. For
us to continue to put out a top-notch book, we have to make sure that we're
using top-notch numbers.
That said, few of the statistical concepts you'll find in the book are wholly
original. Do you want to use hitting statistics to estimate runs scored? Well,
Bill James and Pete Palmer did that a long time ago, and there are books out
there now that use their methods, particularly Palmer's Total Baseball. Do you
want to convert minor league statistics to major league equivalents? Again,
these can be found in the STATS Minor League Handbook. Evaluations of fielding
statistics? Total Baseball, again.
What then do the Davenport Translations (or DTs, conceptually similar to James'
MLEs) and Equivalent Average (which measures run production from batting stats)
offer that these older tools don't? First off, these tools are not simply
copies of earlier works, but independent attempts to solve the same problems.
There is never anything wrong with having more than one option to get your
result. The argument has been made that the older methods, by virtue of their
being older, should have priority, an idea whose reductio ad absurdum would be
to consider runs scored per game played the ultimate measure of individual
accomplishment, as it was in the 1860s.
I could allow myself to make a similar but opposite argument: the earlier
statistics should be cast aside because they don't do their job as well as the
ones you'll find in the Baseball Prospectus. I don't seriously believe that
this should be done, of course. But the truth is, the BP analysis stats are
better than their counterparts. We haven't played this up more in our books so
far, for which the fault is probably mine. But validation tests make for pretty
boring reading, and take up space that I preferred to use on outlining a
totally new fielding approach, as well as on more players, more players, and
some more players. So, here on the webpage, I'll tackle some of those issues.
For this article, I am going to look at how the statistics for a player compare
from one year to the next, the goal being to show, as James said ten years ago,
that minor league statistics, properly interpreted, are just as useful as major
league statistics for projecting future performance. This is still true: DTs
for minor league performance pass the test of being just as useful as major
league statistics and, beyond that, do so better than the MLEs put out by
STATS.
Chalking The Baselines
The first order of business is to determine just how variable major league
performances themselves are.
Table 1 shows the changes for major league players. The players used are those
who had at least 150 AB in any of the 1990 through 1996 seasons, and had 150 AB
again the next season (i.e., 1991-1997). The second column is the number of
players in that specific age range. The stats used are batting average (BA);
walk rate (WR), calculated as walks per AB+BB; and isolated power (ISO), which
is slugging average minus batting average. "Net" is the error score for the
player, and is equal to two times the BA error, plus the walk rate error, plus
the isolated power error, all in absolute values. This is proportionally
correct for each portion covered; it is essentially on-base percentage plus
slugging percentage, with the batting average pulled out as a separate entry.
All statistics are the translated, not real, ones. Since there are over 1700
players included, what I've done is summarize the statistics by age; this
refers to the player's age in the first year of the pair.
                Average change       Standard deviation
Age       N   BA   WR  ISO  Net     BA   WR  ISO  Net
20-       6  -13    8   -7   93     26   17   37   49
21       16  -12    7    5  125     43   19   43   56
22       39   12    1    9  122     41   30   46   68
23       54   -6    1    3  106     32   25   44   52
24      130    1    3    9  113     35   25   44   55
25      165    3    6    3  111     35   26   46   57
26      185   -6   -3    1  111     32   24   53   59
27      196   -1    3    1  114     35   24   47   58
28      173   -1    2   -1  111     33   25   50   62
29      168   -4   -4   -4  117     36   26   49   63
30      159    2    0    2  110     36   26   45   58
31      130   -7   -1  -11  113     35   27   46   62
32       95   -5    2   -7  118     35   24   50   53
33       69  -10    4   -6  118     36   26   42   56
34       50   -4  -13  -20  108     29   27   45   47
35       33    1    0   13  115     34   29   46   65
36       26    3    1  -21  106     26   32   46   45
37       19  -27   -9  -21  103     24   20   29   53
38+      22   -7    2  -13  133     44   26   44   63
All    1745   -2    0   -1  113     35   26   48   58   All players 1990-97
94-96   761   -2    0   -4  113     35   26   47   58   All players 1994-97
19-26   605   -1    2    4  112     35   25   48   58   Players aged 19-26 in 1990-96
Take-home lessons:
- Demographically, there is a large, broad peak of players aged 25-30, with
rapid declines on either side. Perhaps not coincidentally, the 31-32 transition
is the first featuring large dropoffs in performance, although some might
argue the same for the 29-30 transition.
- The predictability of the various stats is relatively independent of age,
although standard deviations for isolated power are slightly higher during
the 26-29 peak.
- Neither the 1994-1996 subset nor the age 19-26 subset is more variable than
the overall set, although the isolated power means differ somewhat.
- The "normal" values for variability among major leaguers, for comparison
with minor league DTs, are about 35 points of BA, 25 on walk rate, and 45-50 on
power.
Making Contact
The real meat here is the chart of players who had at least 200 AB in AA/AAA in
one of the years during the period 1994-96, and who had at least 150 AB in the
majors the following year; the 200 figure was chosen to conform with STATS'
cutoffs for showing MLEs. In each case, the DT or MLE stats reflect play in AA
and AAA combined, while ignoring play in the majors or lower minor leagues for
the season shown. 1994 was chosen to minimize the effects of the strike (which
could really foul up the 1993 minor to 1994 major dataset), as well as to
minimize the impact of the Offensive Surge which, again, was mostly in the 1992
to 1993 and 1993 to 1994 seasons. This should not affect the DTs much, but the
MLEs could be floored by guessing the wrong overall level of league offense.
First off I need to remind everyone of a caveat that is true for both the DTs
and MLEs. They are not predictions; they are assessments of past performance.
However, since both aim to match a major league season, it is reasonable to
test their accuracy by seeing how different a DT/MLE is from the following
major league season, compared to how different one major league season is from
another.
#Rt is the number of times the Net score for a player was better using one set
or the other. For instance, Manny Alexander's 1994-95 seasons were off in the
DTs by 7 BA points, 39 WR points, and 24 ISO points, for a net of
(2*7)+39+24=77 points. His MLE errors were 16, 60, and 8, for a net error of
(2*16)+60+8=100. The DTs get credit for "one right."
                                   Averages         Standard deviations
Year     Players  #Rt  Set     BA   WR  ISO  Net     BA   WR  ISO  Net
1994-95       52   27  DT      -3    1    1  120     40   28   41   60
                   25  MLE     -5   19   -4  123     37   36   39   60
1995-96       47   30  DT      -4    1   -7  110     37   27   45   64
                   16  MLE     -1   13   -5  121     37   33   49   66
1996-97       49   31  DT      -7   -2   10  102     27   27   43   55
                   18  MLE     -9    3   -3  112     28   34   46   60
1994-96      148   88  DT      -5    0    1  111     35   27   43   60
                   59  MLE     -5   12   -4  119     34   35   45   62
1994-97      148       MjAge   -2    2    4  112     35   25   46   57
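
As a rough illustration of the scoring described above, the Net error and the
"#Rt" tally can be sketched in Python. The function names are hypothetical and
not part of any published DT or MLE tool; the three sample rows are copied from
the player listing at the end of this article.

```python
def net_error(d_ba, d_wr, d_iso):
    """Net error in points: twice the BA miss plus the WR and ISO misses."""
    return 2 * abs(d_ba) + abs(d_wr) + abs(d_iso)


# (player, DT deltas, MLE deltas); each delta tuple is (BA, WR, ISO) in points.
rows = [
    ("Manny Alexander 1994", (7, 39, -24), (16, 60, -8)),
    ("Chris Jones 1994", (32, 2, 43), (-16, 3, -4)),
    ("Jose Herrera 1995", (-9, -10, 17), (8, 1, 28)),
]


def tally(rows):
    """Count how often each method has the smaller Net error, plus ties."""
    dt_right = mle_right = ties = 0
    for _, dt, mle in rows:
        dt_net, mle_net = net_error(*dt), net_error(*mle)
        if dt_net < mle_net:
            dt_right += 1
        elif mle_net < dt_net:
            mle_right += 1
        else:
            ties += 1
    return dt_right, mle_right, ties


print(tally(rows))
```

On these rows, Alexander goes to the DTs (77 vs. 100), Jones goes to the MLEs
(109 vs. 39), and Herrera is the tie (45 each).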

The last line, MjAge, is an age-adjusted version of the major league norms,
using the averages and standard deviations for each age slot of the major
league table and the number of players in the minor-major survey in those same
slots. Relative to those, the DT minor/major adjustment is slightly too
generous with batting average, but shortchanges players a wee bit in walks and
power. It is just as good, even a little better, on the Net score, although it
is a little less reliable (having a higher standard deviation). Variability is
a little higher in walk rate, but actually a little lower in isolated power,
with no difference in BA.
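
One way to read the MjAge construction is as a count-weighted average of the
per-age major league norms. The sketch below assumes that reading. The Net
averages come from the major league table, but the survey counts are invented
purely for illustration and are not the real age distribution of the 148
players.

```python
def age_weighted_norm(norm_by_age, survey_counts):
    """Average the per-age norms, weighting each age by its survey head count."""
    total = sum(survey_counts.values())
    return sum(norm_by_age[age] * n for age, n in survey_counts.items()) / total


# Average Net error by age, taken from the major league table.
major_net_by_age = {21: 125, 22: 122, 23: 106}

# HYPOTHETICAL numbers of minor-major survey players at each age.
survey_counts = {21: 10, 22: 10, 23: 30}

mj_age_net = age_weighted_norm(major_net_by_age, survey_counts)
print(mj_age_net)
```

The same weighting could be applied to any column of averages. How the standard
deviations were combined across age slots is not stated, so that step is left
out here.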
There is nothing here to make anyone believe that this year's minor league DT
is any worse of a predictor for next year's major league DT than this year's
major league DT.
The superiority of the DT method to the MLEs is pretty clear, especially over
the last two seasons. The DTs were more "right" than the MLEs by an 88-59
margin over the last three years (Jose Herrera, in 1995-96, had a 45-point net
from each method for a tie); a split that large or greater over 147 entries
will occur by chance less than 1% of the time if the two have a genuinely equal
chance of being right. The DTs clearly do better in the net score totals,
having both a smaller average error and a smaller standard deviation. The
differences in BA between the two are insignificant; in isolated power the
differences are significant at the 95% (but not 99%) level; the differences in
walk rate are clearly significant.
It's in those walk rates that the differences in the methods really arise. In the
past, David Grabiner has written that MLEs do not handle walks (and strikeouts,
which the DTs don't attempt to cover) particularly well. The numbers above,
which show the standard deviation in walk rates to be much higher for MLEs than
for previous major league seasons, reinforce that conclusion. Yet the
difference in overall results is not entirely a function of walks. If you
ignore the walk rate term in the Net score, the DTs would still win the "right"
category by 78-69 (with one tie again, this time Mike Lieberthal), thanks to a
slight edge in handling isolated power. That wouldn't be a statistically
significant result; however, it's still good enough to support a claim that DTs
are doing at least as well as the MLEs.
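
The significance claims for the "right" tallies amount to a sign test: if each
method were equally likely to win any given player, how often would one of them
win at least this many of the 147 decided cases? A minimal sketch, assuming a
one-sided test with the single tie dropped (the article does not say exactly
which test was used):

```python
from math import comb


def tail_prob(wins, n):
    """P(X >= wins) for X ~ Binomial(n, 1/2), computed exactly."""
    return sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n


# 88-59 split using the full Net score; 78-69 with the walk rate term removed.
p_full_net = tail_prob(88, 147)
p_no_walks = tail_prob(78, 147)

print(f"88 of 147: {p_full_net:.4f}")
print(f"78 of 147: {p_no_walks:.4f}")
```

The first probability comes out on the order of one percent, while the second
is far too large to call significant, in line with the text.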
The confounding part of this study is that DTs do have a built-in advantage.
DTs from any season always use the same baseline level of offense and the same
park factors. A 1995 MLE, by contrast, is set to the offensive level of 1995,
either AL or NL, depending on what league the player's parent club was in, and
the park factor of the parent club. Lyle Mouton's 1994 MLE was for Yankee
Stadium, for the 1994 American League, but his 1995 comparison season was
Comiskey for the 1995 AL season.
Proofing the Boxscore
Here's a listing of all 148 minor-major players in the study.
"Age" and "Yr" represent the first year of the two-year samples. "Deltas" are
the Year2 value of a statistic minus its Year1 value, in points.
DTs are taken from formulations used in the 1998 Baseball Prospectus. For
example, Manny Alexander's 1994 DT called for a .228 batting average, .254
on-base average, and .339 slugging; his 1995 DT was .234/.291/.321. That breaks
down to a .228 BA, .034 WR, and .111 ISO for 1994, and .234/.073/.087 for 1995,
so the differences are +7, +39, and -24. (Note that, if OBA is based only on
AB, H, and BB, then walk rate is (OBA-BA)/(1-BA).) Net is 2*7+39+24, or 77.
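
The breakdown from a BA/OBA/SLG line into BA, WR, and ISO can be sketched as
follows. With OBA = (H+BB)/(AB+BB) and WR = BB/(AB+BB), it follows that
OBA = BA + WR*(1-BA), so WR = (OBA-BA)/(1-BA), which reproduces the .034 figure
for 1994. The function name is hypothetical, and the inputs are the rounded
three-digit figures quoted in the example, so the computed deltas can differ by
a point or two from the published ones, which were presumably worked out from
unrounded numbers.

```python
def components(ba, oba, slg):
    """Split a BA/OBA/SLG line into (BA, walk rate, isolated power)."""
    walk_rate = (oba - ba) / (1 - ba)  # assumes OBA uses only AB, H, and BB
    iso = slg - ba
    return ba, walk_rate, iso


year1 = components(0.228, 0.254, 0.339)  # Manny Alexander, 1994 DT
year2 = components(0.234, 0.291, 0.321)  # Manny Alexander, 1995 DT

# Deltas in points (thousandths), Year2 minus Year1, in BA/WR/ISO order.
deltas = [round((b - a) * 1000) for a, b in zip(year1, year2)]

print([round(x, 3) for x in year1])
print([round(x, 3) for x in year2])
print(deltas)
```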
MLE deltas represent the difference between actual major league performance in
Year2 and the MLEs published in the Minor League Handbook for that year.
Last First Age Yr DT-Net MLE-Net DT-Right MLE-Right DT-BA DT-WR DT-ISO MLE-BA MLE-WR MLE-ISO
ALEXANDER MANNY 23 1994 77 100 1 0 7 39 -24 16 60 -8
ALFONZO EDGARDO 20 1994 112 120 1 0 15 -56 -26 16 -58 -30
ANDERSON GARRETT 22 1994 137 131 0 1 37 -7 56 40 5 46
ANDREWS SHANE 22 1994 144 126 0 1 -40 -55 9 -15 -64 32
ASHLEY BILLY 23 1994 232 211 0 1 -53 -4 -122 -45 35 -86
BATES JASON 23 1994 68 101 1 0 10 10 38 8 37 48
BECKER RICH 22 1994 110 134 1 0 -23 -39 -25 -43 -10 -38
BROGNA RICO 24 1994 145 160 1 0 54 28 9 63 2 5
BUFORD DAMON 24 1994 163 221 1 0 -35 64 -29 -39 107 -36
CIRILLO JEFF 24 1994 119 78 0 1 -32 24 -31 -5 48 -20
CORDOVA MARTY 24 1994 99 160 1 0 -20 5 54 -41 45 33
CROMER TRIPP 26 1994 97 103 1 0 -21 -23 -32 -25 -19 -34
DURHAM RAY 22 1994 93 62 0 1 -16 -23 -38 -14 2 -32
EVERETT CARL 24 1994 155 211 1 0 -7 73 68 -28 99 56
FABREGAS JORGE 24 1994 168 180 1 0 64 16 24 55 33 37
GIAMBI JASON 23 1994 103 134 1 0 25 24 29 23 60 28
GIL BENJI 21 1994 85 54 0 1 -32 -10 11 -10 5 29
GONZALEZ ALEX 21 1994 93 76 0 1 -27 4 35 -26 9 15
GOODWIN CURTIS 21 1994 37 35 0 1 4 -11 18 -5 4 21
GOODWIN TOM 25 1994 55 71 1 0 16 19 4 9 50 -3
GREEN SHAWN 21 1994 185 173 0 1 -34 -31 86 -41 -26 65
GRUDZIELANEK MARK 24 1994 100 156 1 0 -29 -7 -35 -42 21 -51
HIGGINSON BOB 23 1994 135 198 1 0 -39 48 -9 -44 59 -51
HUNTER BRIANL 23 1994 39 31 0 1 0 -16 -23 -13 -3 -2
HUSON JEFF 29 1994 85 77 0 1 -37 -7 -4 -34 8 1
JOHNSON CHARLES 22 1994 59 122 1 0 6 9 -38 18 45 -41
JOHNSON MARK 26 1994 163 149 0 1 -46 21 50 -43 47 16
JONES CHRIS 28 1994 109 39 0 1 32 2 43 -16 3 -4
LEWIS MARK 24 1994 283 242 0 1 108 30 37 96 21 29
LOCKHART KEITH 29 1994 264 178 0 1 77 -26 84 54 1 69
MABRY JOHN 23 1994 168 178 1 0 61 0 -46 67 6 -38
MANTO JEFF 29 1994 127 67 0 1 -27 -36 37 -15 -35 -2
MASTELLER DAN 26 1994 49 131 1 0 -9 28 3 -28 50 -25
MERULLO MATT 28 1994 48 73 1 0 8 13 -19 1 28 -43
MILLER ORLANDO 25 1994 158 166 1 0 65 28 0 51 52 12
MOUTON LYLE 25 1994 137 73 0 1 32 26 47 13 43 -4
NEVIN PHIL 23 1994 124 134 1 0 -46 7 -25 -36 57 -5
NIEVES MELVIN 22 1994 149 136 0 1 -41 -35 32 -56 2 22
O'LEARY TROY 24 1994 78 69 0 1 -10 -54 -4 7 -53 -2
OFFERMAN JOSE 25 1994 125 160 1 0 39 19 28 19 64 58
OLIVA JOSE 23 1994 300 253 0 1 -111 11 -67 -86 25 -56
PAQUETTE CRAIG 25 1994 43 53 1 0 -11 -11 10 -20 -2 -11
PEGUES STEVE 26 1994 109 72 0 1 -33 0 -43 -20 3 -29
PERRY HERB 24 1994 81 22 0 1 20 -18 23 8 1 -5
PHILLIPS J.R. 24 1994 211 210 0 1 -63 -15 -70 -59 -8 -84
STAHOVIAK SCOTT 24 1994 64 90 1 0 6 -21 -31 -16 -1 -57
TAVAREZ JESUS 23 1994 107 115 1 0 42 9 14 37 22 19
TIMMONS OZZIE 23 1994 20 57 1 0 8 0 -4 22 8 5
TUCKER MIKE 23 1994 84 42 0 1 3 -34 -44 15 0 -12
VERAS QUILVIO 23 1994 134 201 1 0 15 53 51 30 85 56
WILLIAMS EDDIE 29 1994 103 201 1 0 -22 -11 -48 -44 14 -99
YOUNG KEVIN 25 1994 90 56 0 1 -36 -14 -4 -21 -6 8
TOTAL 27 25
Last First Age Yr DT-Net MLE-Net DT-Right MLE-Right DT-BA DT-WR DT-ISO MLE-BA MLE-WR MLE-ISO
ALLENSWORTH JERMAINE 23 1995 42 89 1 0 7 17 11 9 35 36
ARIAS GEORGE 23 1995 97 129 1 0 7 -33 -50 -10 -19 -90
AURILIA RICH 23 1995 150 117 0 1 -38 -25 -49 -29 -23 -36
BARTEE KIMERA 22 1995 43 66 1 0 8 1 -26 22 19 -3
BATISTA TONY 21 1995 111 159 1 0 42 12 -15 59 25 -16
BENARD MARVIN 24 1995 31 44 1 0 -8 4 -11 -15 14 0
BRAGG DARREN 25 1995 142 156 1 0 -36 50 20 -29 70 28
BURNITZ JEROMY 26 1995 107 88 0 1 -24 39 20 -15 57 1
CEDENO ROGER 20 1995 29 53 1 0 -5 -15 -4 -6 13 28
CIANFROCCO ARCHI 28 1995 64 42 0 1 27 -2 8 16 6 4
CLARK TONY 23 1995 156 164 1 0 4 -40 108 12 -56 84
COOMER RON 28 1995 62 37 0 1 6 9 41 -2 26 7
DAMON JOHNNY 21 1995 192 171 0 1 -35 -67 -55 -39 -48 -45
DELGADO CARLOS 23 1995 157 95 0 1 -56 -8 -37 -25 4 -41
DYE JERMAINE 21 1995 50 87 1 0 8 -22 12 27 -16 17
EVERETT CARL 25 1995 155 196 1 0 -28 34 -65 -41 43 -71
FOX ANDY 24 1995 249 238 0 1 -81 -9 -78 -81 -3 -73
GREENE WILLIE 23 1995 73 114 1 0 12 9 40 20 25 49
GUTIERREZ RICKY 25 1995 124 68 0 1 55 2 -12 24 17 -3
HERRERA JOSE 22 1995 45 45 0 0 -9 -10 17 8 1 28
HUSKEY BUTCH 23 1995 97 117 1 0 3 -20 71 12 -21 -72
JETER DEREK 21 1995 92 96 1 0 16 -33 27 19 -27 31
KENDALL JASON 21 1995 32 13 0 1 -1 -22 -8 0 -12 -1
LAWTON MATT 23 1995 48 69 1 0 -11 0 -26 -4 -8 -53
LIEBERTHAL MIKE 23 1995 120 121 1 0 -8 -66 38 -1 -67 52
LORETTA MARK 23 1995 78 86 1 0 -2 9 -65 2 26 -56
MCCARTY DAVE 25 1995 271 254 0 1 -91 6 -83 -86 12 -70
MCCRACKEN QUINTON 25 1995 150 192 1 0 -54 31 11 -72 40 -8
MOUTON LYLE 26 1995 77 49 0 1 23 11 20 9 23 8
MUELLER BILL 24 1995 125 135 1 0 59 6 1 57 9 12
NEWFIELD MARC 22 1995 30 108 1 0 0 -9 21 22 10 54
OBANDO SHERMAN 25 1995 104 109 1 0 -22 33 27 -20 50 19
OCHOA ALEX 23 1995 113 127 1 0 42 -16 13 40 -10 37
ORDONEZ REY 23 1995 160 139 0 1 61 -10 -28 57 -9 -16
OTERO RICKY 23 1995 48 71 1 0 19 -3 7 23 2 23
OWENS JAYHAWK 26 1995 202 335 1 0 -37 52 -76 -55 86 -139
OWENS ERIC 24 1995 299 289 0 1 -89 0 -121 -85 4 -115
PEREZ ROBERT 26 1995 60 88 1 0 3 5 -49 6 17 -59
PEREZ EDDIE 27 1995 88 103 1 0 -6 16 60 12 24 55
RANDA JOE 25 1995 143 130 0 1 58 -17 10 57 2 14
RENTERIA EDGAR 19 1995 120 135 1 0 45 28 2 44 34 13
RODRIGUEZ ALEX 19 1995 23 90 1 0 -3 11 6 20 19 31
SANTANGELO F.P. 27 1995 125 159 1 0 36 13 40 45 24 45
SWEENEY MARK 25 1995 205 181 0 1 -72 15 -46 -54 43 -30
TUCKER MIKE 24 1995 121 163 1 0 -13 21 74 -19 49 76
VITIELLO JOE 25 1995 109 121 1 0 -11 71 -16 -6 108 1
YOUNG ERNIE 25 1995 53 32 0 1 -7 -16 23 5 1 21
TOTAL 30 16
Last First Age Yr DT-Net MLE-Net DT-Right MLE-Right DT-BA DT-WR DT-ISO MLE-BA MLE-WR MLE-ISO
ABREU BOB 22 1996 66 29 0 1 8 -31 -19 1 -23 4
ALLENSWORTH JERMAINE 24 1996 128 152 1 0 -32 19 -45 -41 32 -38
AMARO RUBEN 31 1996 36 72 1 0 -3 19 11 -6 43 17
BARRON TONY 29 1996 17 37 1 0 -1 10 -5 3 16 -15
BELLHORN MARK 21 1996 59 47 0 1 -8 14 29 -2 28 15
BENITEZ YAMIL 23 1996 35 51 1 0 -11 -10 3 20 0 11
BREDE BRENT 24 1996 101 169 1 0 -26 -35 -14 -44 -42 -39
BRITO TILSON 24 1996 106 125 1 0 -16 -27 -47 -17 -20 -71
CAMERON MIKE 23 1996 113 177 1 0 -26 27 -34 -28 28 -93
CASTILLO LUIS 20 1996 103 131 1 0 -26 -21 -30 -48 -6 -29
CRUZ JOSE 22 1996 158 165 1 0 -9 -34 106 -15 -52 83
CUMMINGS MIDRE 24 1996 88 114 1 0 4 47 33 -9 57 39
DIFELICE MIKE 27 1996 82 91 1 0 -21 -3 -37 -19 6 -47
ERSTAD DARIN 22 1996 65 95 1 0 -6 -13 40 12 -21 50
FRANCO MATT 26 1996 69 86 1 0 -16 10 27 -29 16 12
GIL BENJI 23 1996 43 41 0 1 11 -16 5 12 -17 0
GILES BRIAN 25 1996 107 168 1 0 -30 36 -11 -31 46 -60
GLANVILLE DOUG 25 1996 48 47 0 1 9 16 14 9 22 7
GOODWIN CURTIS 23 1996 89 96 1 0 -4 -48 -33 12 -49 -23
GRAFFANINO TONY 24 1996 111 115 1 0 0 43 68 -2 58 53
GUERRERO VLADIMIR 20 1996 121 102 0 1 -34 -36 -17 -27 -30 -18
GUERRERO WILTON 21 1996 98 84 0 1 20 -27 31 8 -20 48
HATTEBERG SCOTT 26 1996 163 98 0 1 38 -46 41 16 -60 6
LAWTON MATT 24 1996 100 131 1 0 -7 43 43 -22 62 25
MCDONALD JASON 24 1996 211 203 0 1 54 32 71 48 41 66
MCGUIRE RYAN 24 1996 79 90 1 0 12 -19 36 23 -13 31
MUELLER BILL 25 1996 196 155 0 1 44 30 78 21 39 74
NEVIN PHIL 25 1996 99 219 1 0 -31 -18 19 -48 -42 -81
NUNNALLY JON 24 1996 128 160 1 0 39 11 39 56 23 25
OCHOA ALEX 24 1996 211 243 1 0 -62 -37 -50 -75 -48 -45
ORIE KEVIN 23 1996 65 39 0 1 -14 -12 25 -3 -15 18
PEREZ EDDIE 26 1996 102 63 0 1 -29 -3 41 -15 -4 29
PEREZ NEIFI 21 1996 77 108 1 0 -1 32 43 -33 38 4
POLCOVICH KEVIN 26 1996 175 161 0 1 41 37 56 28 47 58
POSADA JORGE 24 1996 92 29 0 1 18 -26 30 0 -29 0
REESE POKEY 23 1996 21 43 1 0 -5 3 8 9 12 13
ROLEN SCOTT 21 1996 52 74 1 0 -7 1 37 -9 25 31
SPIEZIO SCOTT 23 1996 16 23 1 0 1 -10 -4 6 -6 -5
STEVENS LEE 29 1996 98 204 1 0 -11 -60 -16 -17 -74 -96
STEWART SHANNON 22 1996 61 71 1 0 1 -8 51 13 4 41
STYNES CHRIS 23 1996 58 69 1 0 8 -6 -36 23 5 -18
SVEUM DALE 32 1996 43 80 1 0 -5 7 -26 -7 15 -51
SWEENEY MIKE 22 1996 160 102 0 1 -38 -25 -59 -21 -5 -55
VIDRO JOSE 21 1996 62 82 1 0 7 16 -32 14 17 -37
VOIGT JACK 30 1996 192 186 0 1 -32 -23 105 -39 -41 67
WALKER TODD 23 1996 245 271 1 0 -67 -21 -90 -74 -16 -107
WIDGER CHRIS 25 1996 139 129 0 1 -50 12 27 -60 7 2
WOMACK TONY 26 1996 100 73 0 1 26 13 35 11 19 32
YOUNG DMITRI 22 1996 221 207 0 1 -61 30 -69 -48 53 -58
TOTAL 31 18