Baseball Prospectus

October 9, 2003

Getting PADE

Improving on Defensive Efficiency

by James Click

Evaluating defense has always been one of the more difficult tasks for performance analysts. The first reason for this is that looks can be deceiving. Sure, that acrobatic shortstop playing in the country's largest market might appear to be a superior defender to the untrained eye, but all too often we draw our conclusions by emphasizing the outcome rather than the process of fielding the ball itself. The second reason is the still-severe limitations we face in collecting data, and in properly interpreting that data once we get a meaningful amount of it. Granted, there are some statistics that can be used when evaluating defense--errors, fielding percentage, Range Factor, Zone Rating, etc.--but none of them is without its flaws.

Which brings us to one of Bill James' measures for quantifying defensive performance: Defensive Efficiency (provided here by Keith Woolner). Defensive Efficiency is a metric that measures a team's ability to turn balls-in-play into outs, using the formula...

    (TotalOuts - Strikeouts)/(BIP-HR)

Despite being raw and only applying to entire teams, Defensive Efficiency is a fair measure of overall defensive performance. But that doesn't mean it can't be improved.
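As a quick illustration, the formula above can be written as a small Python function. This is only a sketch: the function name, the argument names, and the sample totals are made up for the example and are not real team data.

```python
def defensive_efficiency(total_outs, strikeouts, balls_in_play, home_runs):
    """Defensive Efficiency as written in the formula above:
    (TotalOuts - Strikeouts) / (BIP - HR)."""
    denominator = balls_in_play - home_runs
    if denominator <= 0:
        raise ValueError("BIP - HR must be positive")
    return (total_outs - strikeouts) / denominator


# Hypothetical season totals, for illustration only (not real team data).
example_de = defensive_efficiency(
    total_outs=4350, strikeouts=1050, balls_in_play=4800, home_runs=170
)
print(round(example_de, 4))
```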

Defense can be broken down into several facets, primarily pitching, ballpark, and actual defensive performance. While we've conceded that pitching and defense are extremely difficult to separate, it's much easier to take into account the venue in which the game took place. For example, looking at this season's numbers, Defensive Efficiency rates the Rockies as one of the league's worst defenses, while the Dodgers have one of the best. But how much of that is actual performance, and how much of that is simply a function of each team's playing environment? Can we determine how the Rockies would perform if they played someplace else?

We already use park factors when adjusting hitting and pitching statistics, and they can be applied to defense as well. However, using established park factors to adjust our defensive statistics would yield skewed results, as they take into account the full slate of offensive statistics, most notably home runs. Smaller ballparks are the main concern, as they yield a higher park factor, mostly thanks to home runs. But the fact of the matter is that many small ballparks might actually be easier to play defense in, since their outfields are much smaller.

Since we can't use established park factors, the first step has to be to establish a defensive baseline for each park in the majors. There are several ways to do this. One would be to generate a Def_Eff number for each park in the majors, using James' formula but applying it to parks instead of teams, and using statistics over a wider range of time (say, three-to-five years). However, doing so would still allow various defenses too much input on the park factors. For instance, how would Turner Field's park factor look if Andruw Jones hadn't been patrolling center field the entire time? Or Torii Hunter in the Metrodome and Ichiro Suzuki and Mike Cameron in Safeco? Even when allowing for visiting teams' numbers, those star defensive players make up half of the available statistics for a park, and could skew the numbers.

Instead, at Keith's suggestion, we'll use the ratio of each team's home Def_Eff to its away Def_Eff, using numbers for the last three years where applicable. There are some adjustments to be made, particularly in Cincinnati, where the Great American Ballpark has only been open for one season, and in Puerto Rico, where there were a mere 22 games played in Hiram Bithorn Stadium. Though a small sample size flag should go up for both of those, it shouldn't increase the amount of statistical noise (the nice way of saying "error") in our numbers. The results are below, complete with Clay Davenport's full park factors for 2001-03. Note that with DE Park Factor, the lower the number, the harder it is to play defense in a particular park; whereas with Clay's Full Park Factor, the higher numbers indicate advantages to the hitters.


                          DE Park   Full Park
Park                       Factor     Factor
---------------------------------------------
Coors Field                0.9544      1112
Kauffman Stadium           0.9773      1100
Fenway Park                0.9779      1010
Ballpark at Arlington      0.9779      1053
Bank One Ballpark          0.9782      1060
Metrodome                  0.9875      1009
Minute Maid Park           0.9880      1038
Olympic Stadium            0.9898      1067
Tropicana Field            0.9899       997
SkyDome                    0.9915      1034
Edison Field               0.9953       987
PNC Park                   0.9955      1009
Pac Bell Park              0.9984       942
Jacobs Field               1.0021       997
Pro Player Stadium         1.0027       955
Busch Stadium              1.0028       974
Turner Field               1.0038       986
Hiram Bithorn Stadium      1.0092      1120
Comerica Park              1.0093       966
Great American Ballpark    1.0097       998
Wrigley Field              1.0098       976
Shea Stadium               1.0104       950
New Comiskey Park          1.0113      1018
Miller Park                1.0127       995
Network Associates Col.    1.0161      1003
Qualcomm Stadium           1.0169       918
Yankee Stadium             1.0173       976
Camden Yards               1.0206       959
Veterans Stadium           1.0216       939
Safeco Field               1.0218       949
Dodger Stadium             1.0294       917
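
The DE Park Factor column comes from a simple home/away ratio. A minimal sketch, assuming each team's home and away Def_Eff have already been computed over the multi-year window; the function name and the sample inputs are hypothetical, and how the seasons are pooled is not spelled out here, so that step is left out.

```python
def de_park_factor(home_def_eff, away_def_eff):
    """Ratio of a team's home Def_Eff to its away Def_Eff.

    A value below 1.0 suggests the home park is harder on defenses;
    a value above 1.0 suggests it is easier.
    """
    if away_def_eff <= 0:
        raise ValueError("away Def_Eff must be positive")
    return home_def_eff / away_def_eff


# Hypothetical multi-year Def_Eff values (not taken from the table above).
hard_park = de_park_factor(home_def_eff=0.680, away_def_eff=0.712)
easy_park = de_park_factor(home_def_eff=0.720, away_def_eff=0.700)
print(round(hard_park, 4), round(easy_park, 4))
```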

There are a few notable parks that move up or down the list. Pac Bell Park, normally one of the game's most pitcher-friendly parks, is actually slightly tougher-than-average on defenses, likely owing much to its expansive outfield. Pro Player Stadium falls into this category as well. Fenway Park, despite having a park factor almost identical to Network Associates Coliseum, is actually one of the most difficult parks on defenders--but for slightly different reasons than Pac Bell. While the Coliseum is a symmetrical ballpark, Fenway's nooks, crannies, and monsters turn many routine fly balls into singles and doubles (or for Bucky Dent, home runs). For the most part, the results hold true to our previous thinking--Dodger Stadium is up at the top, while Coors is way off the bottom--but the adjustments will make our measurements more accurate.

Now that we've established a general idea of how difficult it is to play defense in each park, we can see how teams perform against that average. To do so, we need to establish a team baseline for the season. By multiplying the number of games a team plays in each park by that park's defensive average and then dividing by the total number of games played, we can establish a baseline for how difficult a team's schedule was on the defense. To save some space, we won't include those numbers here, but if you really want them, let me know.
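
The schedule baseline described above is a games-weighted average of park factors. Here is a rough sketch; it assumes that "defensive average" means the DE Park Factor from the table, and the schedule itself (game counts, and only three parks) is invented to keep the example short.

```python
def schedule_baseline(games_by_park, park_factors):
    """Games-weighted average of DE park factors over a team's schedule."""
    total_games = sum(games_by_park.values())
    if total_games == 0:
        raise ValueError("schedule has no games")
    weighted_sum = sum(
        games * park_factors[park] for park, games in games_by_park.items()
    )
    return weighted_sum / total_games


# DE Park Factors copied from the table above.
park_factors = {
    "Coors Field": 0.9544,
    "Dodger Stadium": 1.0294,
    "Pac Bell Park": 0.9984,
}

# Hypothetical schedule: the game counts are made up for illustration.
games_by_park = {"Coors Field": 81, "Dodger Stadium": 10, "Pac Bell Park": 9}

baseline = schedule_baseline(games_by_park, park_factors)
print(round(baseline, 5))
```

A team that plays most of its games in a tough defensive park ends up with a baseline below 1.0, which is what lets the adjustment give some credit back to that team's fielders.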

Then, quite simply, we divide James' raw Defensive Efficiency for each team, re-centered around the league average, by each team's schedule-adjusted Defensive Efficiency. This calculation yields a percentage that gives us an idea of how each team's defense performed against the expected league average, given their schedule. We'll clumsily call this metric PADE--Park Adjusted Defensive Efficiency--and will now open the floor for suggestions for a new acronym. Here are the results for 2003:


Team                             PADE
-------------------------------------
Tampa Bay Devil Rays            2.141
Seattle Mariners                1.887
Houston Astros                  1.831
San Francisco Giants            1.774
Oakland Athletics               1.522
Anaheim Angels                  1.067
Cleveland Indians               1.020
Chicago White Sox               0.756
Arizona Diamondbacks            0.752
Minnesota Twins                 0.570
Atlanta Braves                  0.552
Kansas City Royals              0.512
Los Angeles Dodgers             0.148
St Louis Cardinals              0.064
Montreal Expos                  0.049
Pittsburgh Pirates             -0.081
Colorado Rockies               -0.171
Philadelphia Phillies          -0.324
Boston Red Sox                 -0.390
San Diego Padres               -0.517
Chicago Cubs                   -0.679
Detroit Tigers                 -1.061
Florida Marlins                -1.116
Cincinnati Reds                -1.162
Toronto Blue Jays              -1.208
New York Mets                  -1.226
Milwaukee Brewers              -2.141
Texas Rangers                  -2.332
New York Yankees               -2.497
Baltimore Orioles              -2.690

A league average defense will yield a rating of 0.000. A team with a PADE of 1.000 turns 1% more BIP into outs than an average team in their schedule--not an insignificant amount.
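
The exact arithmetic behind PADE isn't written out above, so the following is only one reading that matches the stated scale (0.000 for an average defense, 1.000 for 1% more outs than expected). It assumes "re-centered around the league average" means dividing team Def_Eff by league Def_Eff; the function name and all sample inputs are hypothetical.

```python
def pade(team_def_eff, league_def_eff, schedule_factor):
    """One possible reading of PADE, expressed in percent.

    Assumption: re-centering means team Def_Eff / league Def_Eff, and the
    result is divided by the team's games-weighted schedule baseline.
    """
    if league_def_eff <= 0 or schedule_factor <= 0:
        raise ValueError("league Def_Eff and schedule factor must be positive")
    recentered = team_def_eff / league_def_eff
    return (recentered / schedule_factor - 1.0) * 100.0


# Hypothetical inputs, for illustration only.
average_team = pade(team_def_eff=0.700, league_def_eff=0.700, schedule_factor=1.000)
tough_schedule = pade(team_def_eff=0.690, league_def_eff=0.700, schedule_factor=0.966)
easy_schedule = pade(team_def_eff=0.700, league_def_eff=0.700, schedule_factor=1.030)
print(round(average_team, 3), round(tough_schedule, 3), round(easy_schedule, 3))
```

Under this reading, a team with a below-average raw Def_Eff can still come out positive if its schedule was hard enough on defenses, which is the kind of shift the Rockies show in the table.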

There aren't many teams that suddenly appear to be significantly better or worse defensively than with James' original metric. However, there are a few interesting moves to note.

  • Colorado, originally ranked second-from-the-bottom, appears to be much closer to league average.

  • The Rangers, Yankees, and Orioles don't get any help, remaining cellar-dwellers.

  • The Dodgers drop significantly down the list, moving from the fifth-best team to only slightly above average--obviously due to playing so many games in extreme pitchers' parks like Dodger Stadium and Pac Bell Park.

  • Those kids in Tampa can play, jumping ahead of the Mariners and A's for the top spot in this year's list.

So what is PADE good for? Primarily, it's a purer metric of a total team's defensive performance that does not punish or reward teams for playing in certain parks. Using PADE does not account for how individual players affect a team's defense, but it can, among other things, give a better estimate of which pitchers benefited the most from nifty glovework, yielding more accurate appraisals of individual pitching performances. There are still many improvements that can be made to our defensive statistics, but removing park factors is a good first step.
