
August 31, 2009

Prospectus Hit and Run

Building an MVP Predictor

by Jay Jaffe


As the recent scrum between supporters of the candidacies of Joe Mauer and Mark Teixeira reminds us, nearly every Most Valuable Player Award is capable of producing controversy. Not only do the voters from the Baseball Writers Association of America rarely elect the player who, via some objective formula, is worth the most wins to his team, they appear to shift their standards from year to year, instead constructing narratives to fit whatever loosely-gathered facts are at hand. Particularly in recent years, defensive value is often minimized or entirely ignored in favor of heavy hitters with big Triple Crown stats, almost invariably from successful teams.

The question is whether the voters' behavior can be predicted. To that end, I was tasked with building an MVP predictor in the spirit of a system such as Bill James' Hall of Fame Monitor, one that awards points for various levels of achievement in an attempt to identify who will win, as opposed to who should win. My initial bursts of enthusiasm for the assignment were soon followed by endless hours of cowering in the fetal position before a massive spreadsheet, but in the end I emerged with a system, Jaffe's Ugly MVP Predictor (JUMP), which correctly identified 14 of the 28 winners during the Wild Card Era (1995 onward), and put 27 of those winners among the league's top three in its point totals.

I limited the scope of the system to that post-strike timeframe for three main reasons: none of the 28 winners were pitchers, only one played for a team that finished below .500 (Alex Rodriguez in 2003), and 22 of them played on teams that qualified for the expanded postseason. Those are extremely strong tendencies that could help separate seemingly equal candidates. Instead of focusing on round-numbered benchmarks as James did (a .300 batting average, 100 RBI), I chose to dispense with actual stat totals and rates and focus on league rankings among batting title qualifiers (3.1 plate appearances per game) in 12 key offensive categories: batting average, on-base percentage, slugging percentage, OPS, hits, homers, total bases, runs, RBI, walks, intentional walks, and steals. Through much study, trial, and error (indeed, every single step of the process involved this), I eventually settled upon a 10-7-5-3-2-1-1-1-1-1 point system in each category, which produces a slight scoring bonus for leading the league or finishing in the top three, and some acknowledgement of a top-ten finish.
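The rank-based scale described above can be sketched in a few lines of Python. This is only an illustration of the 10-7-5-3-2-1-1-1-1-1 scheme, not the original spreadsheet; the function names and the example player are hypothetical, and the handling of ties is not covered because the article does not specify it.

```python
# Points for finishing 1st through 10th in a category, per the scale above.
RANK_POINTS = [10, 7, 5, 3, 2, 1, 1, 1, 1, 1]


def category_points(rank):
    """Return the points for a league rank (1 = league leader).

    Ranks outside the top ten score nothing.
    """
    if 1 <= rank <= len(RANK_POINTS):
        return RANK_POINTS[rank - 1]
    return 0


def stat_points(ranks):
    """Sum the points over a dict mapping category name to league rank."""
    return sum(category_points(rank) for rank in ranks.values())


# Hypothetical player: leads in homers, third in RBI, eighth in runs, 12th in hits.
example_ranks = {"HR": 1, "RBI": 3, "R": 8, "H": 12}
print(stat_points(example_ranks))  # 10 + 5 + 1 + 0 = 16
```

The steep drop from first to fourth is what produces the bonus for leading the league or finishing in the top three, while ranks six through ten all earn the same single point.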

Surprisingly enough, it's not a strong showing in RBI or even home runs that is most common among the award winners of this era:


Category        Lead  Top 5  Top 10
Total Bases       7    19     25
Slugging Pct.    10    18     23
Runs              9    16     23
OPS               8    17     22
Batting Avg.      3    11     21
Home Runs         7    19     21
Runs Batted In    6    18     20
Intentional BB    7    13     20
On-Base Pct.      6    11     18
Hits              2    10     16
Bases on Balls    6     8     12
Stolen Bases      1     3      5

As you'll see below, the lack of a correlation between the getting-on-base stats and the eventual hardware had consequences that needed to be taken into account.

Because team performance has such an overwhelming effect on the voters' perceptions of players' candidacies, I recorded each team's record and route to the playoffs, fixing upon a system that awarded a maximum of three "Team Success Points": one for finishing at or above .500, plus either one more for winning the Wild Card or two more for winning the division. Those points were then multiplied by the team's win total and divided by nine; a player on a 99-win division winner thus received 33 points (3 x 99 / 9), one on a 90-win Wild Card team received 20 points (2 x 90 / 9), and one on an 81-win team that missed the playoffs received nine points (1 x 81 / 9). These team points, which can outweigh the points of any individual category, do much to winnow the field.
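The team adjustment is simple enough to check by hand, but a short Python sketch makes the arithmetic explicit. The function name and arguments are hypothetical, and treating the .500 point and the playoff points as additive is an assumption drawn from the three worked examples above; the article does not say how a division winner with a losing record would be scored.

```python
def team_success_points(wins, losses, won_division=False, won_wild_card=False):
    """Scale a 0-3 team success level by the win total, then divide by nine."""
    level = 0
    if wins >= losses:       # finished at or above .500
        level += 1
    if won_division:         # division title is worth two more (assumed additive)
        level += 2
    elif won_wild_card:      # Wild Card is worth one more
        level += 1
    return level * wins / 9


print(team_success_points(99, 63, won_division=True))   # 33.0
print(team_success_points(90, 72, won_wild_card=True))  # 20.0
print(team_success_points(81, 81))                      # 9.0
```

Since a single category tops out at 10 points, a spot on a strong division winner is worth more than leading the league in three categories.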

At that point, various iterations of the system (some of which included weighting the stat categories according to the frequency with which past winners had placed in the top 10) correctly identified anywhere from nine to 12 winners out of 28, not a terribly impressive result. From there, JUMP became an exercise in careful gerrymandering, not only to increase the direct hits but to push as many winners as possible into the league's top three, a concession to the fact that at some point subjective elements take over for a number of voters. The point totals in a few categories (OBP, OPS, hits and walks) were dropped entirely from the scoring once it was determined that excluding them made no difference; simplicity was given the priority. Intentional walks were reduced to a 0.5 weight, stolen bases to 0.55. I introduced a positional adjustment, adding 3.33 points for middle infielders and penalizing 13 points for designated hitters, and an anti-Rockies adjustment, penalizing 10 points for high-altitude residence. All of these values were arrived at only after tedious trial and error.
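Putting the pieces together, a rough Python sketch of the final scoring might look like the following. Several details here are assumptions rather than anything stated above: that the surviving categories keep a weight of 1.0, that all components are simply added, and that "middle infielders" means second basemen and shortstops. The function name, the category codes, and the example player are hypothetical.

```python
RANK_POINTS = [10, 7, 5, 3, 2, 1, 1, 1, 1, 1]

# Category weights after the adjustments described above. OBP, OPS, hits, and
# walks are absent because they were dropped; 1.0 for the rest is an assumption.
WEIGHTS = {
    "AVG": 1.0, "SLG": 1.0, "HR": 1.0, "TB": 1.0,
    "R": 1.0, "RBI": 1.0, "IBB": 0.5, "SB": 0.55,
}


def jump_score(ranks, team_points, position, plays_for_rockies=False):
    """Combine weighted rank points, team points, and the two adjustments."""
    total = float(team_points)
    for category, rank in ranks.items():
        if 1 <= rank <= len(RANK_POINTS):
            total += WEIGHTS.get(category, 0.0) * RANK_POINTS[rank - 1]
    if position in ("2B", "SS"):   # middle-infield bonus
        total += 3.33
    elif position == "DH":         # designated hitter penalty
        total -= 13
    if plays_for_rockies:          # high-altitude penalty
        total -= 10
    return total


# Hypothetical shortstop on a 90-win Wild Card team (20 team points).
score = jump_score({"HR": 1, "RBI": 2, "SB": 1, "OBP": 1}, 20, "SS")
print(round(score, 2))  # 20 + 10 + 7 + 5.5 + 0 + 3.33 = 45.83
```

In this example the league lead in OBP contributes nothing, which mirrors the way a high-OBP candidate can rank low in the system.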

Here's how the actual award winners fared in JUMP, along with the players it flagged as the likely winners in years where they differed from the voting:


Year   AL Winner          Rank    System Winner
1995   Mo Vaughn            3     Albert Belle
1996   Juan Gonzalez        2     Albert Belle
1997   Ken Griffey          1
1998   Juan Gonzalez        1
1999   Ivan Rodriguez      10     Manny Ramirez
2000   Jason Giambi         1
2001   Ichiro Suzuki        2     Bret Boone
2002   Miguel Tejada        2     Alfonso Soriano
2003   Alex Rodriguez       1
2004   Vladimir Guerrero    1
2005   Alex Rodriguez       1
2006   Justin Morneau       3     Derek Jeter
2007   Alex Rodriguez       1
2008   Dustin Pedroia       1

Year   NL Winner          Rank    System Winner
1995   Barry Larkin         3     Dante Bichette
1996   Ken Caminiti         1
1997   Larry Walker         2     Jeff Bagwell
1998   Sammy Sosa           1
1999   Chipper Jones        1
2000   Jeff Kent            3     Barry Bonds
2001   Barry Bonds          3     Sammy Sosa
2002   Barry Bonds          1
2003   Barry Bonds          1
2004   Barry Bonds          3     Albert Pujols
2005   Albert Pujols        1
2006   Ryan Howard          2     Albert Pujols
2007   Jimmy Rollins        3     Matt Holliday
2008   Albert Pujols        2     Ryan Howard
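The headline accuracy figures can be recomputed directly from the Rank columns in the two tables above. The short Python tally below transcribes those ranks in year order (1995 through 2008); the variable names are hypothetical.

```python
# JUMP rank of each actual MVP winner, 1995-2008, copied from the tables above.
al_ranks = [3, 2, 1, 1, 10, 1, 2, 2, 1, 1, 1, 3, 1, 1]
nl_ranks = [3, 1, 2, 1, 1, 3, 3, 1, 1, 3, 1, 2, 3, 2]

all_ranks = al_ranks + nl_ranks
direct_hits = sum(1 for rank in all_ranks if rank == 1)
top_three = sum(1 for rank in all_ranks if rank <= 3)

print(len(all_ranks), direct_hits, top_three)  # 28 14 27
```

The split is eight direct hits in the AL and six in the NL, with Ivan Rodriguez in 1999 as the lone winner outside the top three.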

Memo to Mauer fans: the only catcher to win the award during this era is the one whose result sticks out like a sore thumb. In my gerrymandering efforts, no amount of positional bonus given to Pudge could offset the consequence of boosting Mike Piazza into the top three a few times or creating other problems. The system does correctly nail a few of the curveballs thrown by the voters, including Sosa over McGwire despite the latter's record-setting home run total in 1998 (one reason OBP was dropped), A-Rod with the 71-win Rangers in 2003, and Pedroia last year, and it gets pretty close on players like Larkin, Kent, Suzuki and Rollins, who won the award despite not finishing in the top five in homers, RBI, or slugging percentage, three of the four most common categories populated by MVP winners.

As for the various discrepancies, it doesn't take too much to recall some of the subjective elements which may have played a part in the voting, particularly in the AL. Take the writers' loathing of Belle, who in 1995 led the league in six point-accumulating categories and outdistanced Vaughn 84-47 here (Edgar Martinez was second at 52 points). Recall their fascination with the novelty of Ichiro and that 116-win, post-Griffey/Rodriguez/Randy Johnson team. Note their tendency to avoid voting for Yankees when at all possible, particularly during the Torre dynasty years; Rodriguez's win in 2005 was the first for someone in pinstripes since Don Mattingly in 1985. Remember the way the wholesome Midwestern clutch goodness of Morneau's 130 RBI carried the day over a fine all-around year from Jeter (.343/.417/.483 with 118 runs and 97 RBI), not to mention the season turned in by teammate and batting-title winner Mauer.

Jeter's 2006 plight only serves to reopen the wounds of 1999, a monster year in which he hit .349/.438/.552 with 24 homers, 134 runs and 102 RBI, all career highs; while he ranks second in JUMP, he could do no better than sixth in the actual vote. Pedro Martinez, who won the Cy Young and the pitchers' Triple Crown by going 23-4 with a 2.07 ERA and 313 strikeouts, received the most first-place votes that year but wound up a tight second, the highest by a pitcher during this span. JUMP leader Ramirez wound up tied with teammate Roberto Alomar (who ranked third here) for a close third in the actual voting. Pudge overcame the strong candidacy of teammate Rafael Palmeiro, who ranked fourth here and finished fifth in the voting, leapfrogging a very strong, very unusual field.

There are actually more discrepancies here on the NL side, though the "wrong" winners are often players who wound up winning in other years (namely Bonds, Sosa, Pujols and Howard), softening the blows of those "injustices." Rollins won it as the sparkplug leadoff man who led the league in runs and finished second in total bases, and while the Phillies only made the playoffs on the regular season's final day, who knows how many already-sent votes might have turned Holliday's way given his Game 163 heroics.

As to what the system says about this year's MVP races, the names in the NL race are no surprise. Pujols, who leads the league in five of these categories and is in the top five in two others, is the overwhelming leader with 84 points, followed by Howard with 49 and Chase Utley with 40, repeating last year's third-place ranking. The system has flip-flopped on Pujols and Howard twice, getting the wrong answer both times, but neither time was the gap separating the two as wide as it is this year.

As for the AL, Teixeira leads in RBI and is second in homers, and ranks first with 55 points; he's followed by Miguel Cabrera with 47 points. Surprisingly, Chone Figgins is third with 44 points thanks to the league lead in runs and a third-place showing in steals; teammate Kendry Morales, who's second in the league in slugging, is just a fraction of a point behind him. As for Mauer, he's currently 28th, a consequence of playing for a team that through Saturday was a game under .500; Sunday's Twins win vaults him to 15th. While he leads the league in all three triple-slash categories, OBP doesn't score points here, and he currently cracks the top 10 only in intentional walks. He is 11th in homers and total bases, and 13th in RBI, so if the guy could just snap out of his August funk (.402/.462/.654 with seven homers and 22 RBI), he may yet JUMP into the fray.

A few years back, Rob Neyer and Bill James introduced a Cy Young Predictor formula in the Neyer/James Guide to Pitchers, a formula made possible by the relatively smaller number of statistical inputs which go into consideration for that award, and one that produced a much higher level of accuracy (around 80 percent) than JUMP does. In the end, a sharper mind than mine might well have produced an MVP prediction system with more direct hits, perhaps even a simpler, more elegant system altogether. Nonetheless, JUMP underscores both the wider variety of inputs that can come into play in a single MVP vote and the fact that nearly any given year produces at least a few candidates with strong enough statistical resumés and team backgrounds for a voter to attach to a narrative which rationalizes their vote. As with any season's actual voting results, I strongly suspect we haven't heard the last word on this topic.

A version of this story originally appeared on ESPN Insider.
