April 3, 2008
Schrodinger's Bat
Reminiscing with SFR, the Sequel
"Aw, how could he lose the ball in the sun, he's from Mexico."

Baseball season is finally underway, and that means multiple games every day to choose from, along with a never-ending supply of story lines, intrigue, and just plain excitement from now to October. But at the same time, our quest for enlightenment never ceases, and so this week we'll take a second look at applying Simple Fielding Runs (SFR) to historical play-by-play data.

Regular readers will recall that last week we applied the algorithms for Simple Fielding Runs (at least for infielders) to the 1988 through 1998 seasons, and spent a few enjoyable moments thinking about the accomplishments of the defenders of yesteryear. I noted that SFR could be run in its original form against that particular data set because the two primary pieces of information that are required--fielder position and hit type (line drive, groundball, fly ball, popup)--are relatively intact and found in the same ratios as in the 2003 through 2007 data, as well as the minor league data to which SFR has been applied.

Sadly, that is not the case for earlier seasons, where hit type information is in notably shorter supply. However, the record of the fielder who fielded the ball is essentially complete for the vast majority of the seasons stretching back to 1957. So, just as the scientists at Jurassic Park filled in the missing dinosaur genetic code with that of a frog, we can adjust the SFR algorithm to do likewise when hit type data is absent. Let's just hope our results are a little more positive.

To be more precise, two main adjustments were made to SFR this week.
Before getting into the results, it should be mentioned that I ran the new version for the data from 1957 through 1983. It turns out that 1984 and 1985, like 2000-2002, are missing significantly more fielder designations, while 1986 and 1987 are somewhat better. I simply didn't feel comfortable reporting results for those years without looking more closely at how to account for the missing data. So what follows are the results for just that 27-year period from 1957 through 1983. For this reason, while this system appears to work, as you'll see below, suffice it to say that, as with most software projects, this one remains a work in progress and will likely undergo additional changes in the future.

As we did last week, let's start our look in Table 1 with the overall SFR leader from each season (for a single team), regardless of position.

Table 1. SFR Leaders by Position and Team, 1957-1983

Year  Player           Age  Pos     Balls  Runners  Diff    SFR  Rate
1957  Roy McMillan      27  Short     608      124    28   21.0  1.22
1958  Johnny Logan      31  Short     623      117    38   28.9  1.33
1959  Ernie Banks       28  Short     709      135    38   28.9  1.28
1960  Ernie Banks       29  Short     673      137    36   27.0  1.26
1961  Roy McMillan      31  Short     669      131    37   27.8  1.28
1962  Brooks Robinson   25  Third     542       72    30   23.3  1.41
1963  Dick Groat        32  Short     619       62    68   51.3  2.11
1964  Eddie Kasko       32  Short     557      103    31   23.6  1.30
1965  Ron Hansen        27  Short     721      102    41   31.2  1.41
1966  Dal Maxvill       27  Short     591      108    30   22.8  1.28
1967  Brooks Robinson   30  Third     578       65    41   32.5  1.64
1968  Luis Aparicio     34  Short     734      119    49   36.7  1.41
1969  Brooks Robinson   32  Third     570       73    33   25.7  1.45
1970  Bobby Wine        31  Short     656      120    32   23.9  1.26
1971  Tommy Helms       30  Second    584       89    33   25.0  1.37
1972  Bert Campaneris   30  Short     716      122    33   24.7  1.27
1973  Bert Campaneris   31  Short     710      129    35   26.4  1.27
1974  Tim Foli          23  Short     556       97    40   29.6  1.41
1975  Rick Burleson     24  Short     644       74    65   48.8  1.88
1976  Mark Belanger     32  Short     787      161    27   20.0  1.17
1977  Mark Belanger     33  Short     608      119    25   18.7  1.21
1978  Ozzie Smith       23  Short     788      155    30   22.8  1.20
1979  Ozzie Smith       24  Short     792      159    30   22.9  1.19
1980  Bucky Dent        28  Short     699      149    27   20.2  1.18
1981  Buddy Bell        29  Third     389       58    26   20.2  1.44
1982  Ozzie Smith       27  Short     746      155    22   16.7  1.14
1983  Ryne Sandberg     23  Second    704      104    54   40.4  1.52

Passing the smell test is always the first hurdle to get over when looking at a new metric, and by a reasonable assessment of history the list in Table 1 passes that test. Quite fittingly, Ozzie Smith and Brooks Robinson led the majors three times apiece, with Roy McMillan, Mark Belanger, and Bert Campaneris doing so twice. In this list we also find Rick Burleson, Luis Aparicio, Ryne Sandberg, Buddy Bell, and Dick Groat, all of whom were known for their defensive prowess. Here we also see that Ernie Banks took the crown twice, and it should be noted that SFR rates him highly in 1958 (+21) through 1961 (+9) at shortstop before his move to first in 1962. At first base he had one outstanding season in 1964 (+9) and one poor one in 1965 (-7), but he otherwise rated as average defensively.

As we did last week, we'll now examine each of the four infield positions in a little more detail.

Shortstops

Last week, we introduced a rate statistic that is simply calculated as the ratio of expected to actual baserunners for the chosen time period. Once again we'll use that rate statistic to develop the leaderboards, with the difference that, since we're looking at a longer time period, we'll include only those players who were assigned 1,500 or more balls to field. Table 2 shows the top and bottom shortstops of the period:

Table 2.
Top and Bottom Shortstops by Rate, >= 1,500 Balls, 1957-1983

Player             Span       Balls  Runners  Diff     SFR  Rate
Bob Lillis         1958-1967   2053      373    74    55.8  1.25
Rick Burleson      1974-1983   5296     1065   167   125.9  1.20
Ernie Banks        1957-1961   2916      619   115    87.4  1.19
Mark Belanger      1965-1982   8468     1704   262   198.3  1.16
Eddie Kasko        1957-1966   2195      466    50    38.3  1.15
Hal Lanier         1964-1973   3056      627    90    68.1  1.15
Roger Metzger      1970-1980   5059     1032   138   103.7  1.15
Dick Groat         1957-1967   6739     1488   109    83.9  1.14
Ozzie Smith        1978-1983   4586      977   133   100.1  1.14
Ron Hansen         1958-1972   5099     1051   114    87.0  1.13
----------------------------------------------------------------
Dick McAuliffe     1960-1974   2784      700   -74   -54.8  0.92
Mario Guerrero     1973-1980   2304      594   -49   -36.2  0.92
Toby Harrah        1969-1982   3649      927   -80   -59.3  0.92
Zoilo Versalles    1959-1971   5759     1436  -127   -93.6  0.91
Chico Fernandez    1957-1963   3535      927   -94   -68.5  0.91
Gene Michael       1966-1975   3953     1046  -105   -77.8  0.91
Mike Tyson         1972-1975   1623      414   -40   -29.7  0.90
Frank Taveras      1972-1982   4903     1299  -144  -106.7  0.89
Ruben Amaro        1958-1969   2909      757   -91   -66.3  0.88
Roberto Pena       1965-1971   1673      471   -75   -55.8  0.88

Somewhat surprisingly (to me anyway), Bob Lillis--an original member of the 1962 Houston Colt .45s and later coach and manager for the Astros--finds himself at the top of the list in terms of Rate on the strength of over five seasons as the regular shortstop. What's less surprising is that Belanger comes out on top in total SFR at +198 runs, leaving Campaneris well behind at +143 runs (13th and not shown in Table 2, with a rate of 1.12). It should be noted that Sean Smith, in discussing his very similar TotalZone system, has Belanger at +232 runs. Belanger's career breakdown is shown in Table 3:

Table 3.
Mark Belanger

Year   Pos     Balls  Runners  Diff    SFR  Rate
1965   Short       2        0     0    0.3  4.08
1966   Short      27        3     4    2.6  2.23
1967   Second     57        8     3    2.3  1.39
       Short     162       38    -4   -3.0  0.89
1968   Short     643      112    31   23.6  1.28
1969   Short     664      135    21   15.8  1.15
1970   Short     616      129    12    9.5  1.10
1971   Short     721      143    17   12.8  1.12
1972   Short     437       83     9    6.7  1.11
1973   Short     745      149    24   18.4  1.16
1974   Short     790      164    30   22.2  1.18
1975   Short     696      120    34   25.6  1.28
1976   Short     787      161    27   20.0  1.17
1977   Short     608      119    25   18.7  1.21
1978   Short     556      116    20   14.8  1.17
1979   Short     281       62     4    2.7  1.06
1980   Short     391       86     8    5.9  1.09
1981   Short     254       60     2    1.8  1.04
1982   Short      88       23     0   -0.1  0.99
Total           8525     1712   265  200.6  1.14

Other notable names that don't make the top ten include Alan Trammell at number 26 (+43/1.08), Larry Bowa 39th (+28/1.04), Luis Aparicio 42nd (+47/1.04), and Dave Concepcion 43rd (+41/1.03). The epitome of an average shortstop during this period would be my boyhood hero, Ivan DeJesus, whose total SFR was +0.36 with a Rate of 1.01.

On the bottom we find Roberto Pena, who played primarily for the Phillies, Padres, and Brewers in the late '60s; he was no better offensively, putting up a career .245/.290/.310 line. Pena also has the distinction of recording the lowest SFR total for a season (-54 in 1968), a season in which he committed 32 errors in 133 games at shortstop. Although Frank Taveras stands out among the bottom ten with his -107 runs and rate of 0.89, Don Kessinger actually recorded the lowest overall total of -116 by recording negative totals in 11 of 17 seasons at shortstop, three times (1966, 1974, 1975) finding himself below -20 runs.

Before leaving shortstops, we should also take a look at the first few years of Ozzie Smith's career:

Table 4.
Ozzie Smith

Year   Pos     Balls  Runners  Diff    SFR  Rate
1978   Short     788      155    30   22.8  1.20
1979   Short     792      159    30   22.9  1.19
1980   Short     905      206    18   13.8  1.09
1981   Short     592      142     8    6.2  1.06
1982   Short     746      155    22   16.7  1.14
1983   Short     764      160    24   17.7  1.15
Total           4586      977   133    100  1.14

As mentioned last week, Smith recorded some impressive SFR totals in his age-33 through age-35 seasons, and now we see that he was no slouch in his age-23 through age-28 seasons either. Taken together, those two spans approach 200 runs, and when the mid-1980s are eventually added, there's little doubt that he'll surpass Belanger and challenge Cal Ripken for the top spot in total SFR.

Second Base

Moving on, Table 5 lists the top and bottom second basemen in terms of Rate, once again looking only at those fielders who've been assigned 1,500 or more balls.

Table 5. Top and Bottom Second Basemen by Rate, >= 1,500 Balls, 1957-1983

Player             Span       Balls  Runners  Diff     SFR  Rate
Dick Green         1963-1974   4281      743   135   102.0  1.20
Rob Wilfong        1977-1983   2156      438    24    18.0  1.19
Lou Whitaker       1977-1983   3482      654    79    59.5  1.12
Nellie Fox         1957-1965   4921      939    84    64.9  1.11
Jim Gilliam        1957-1965   1855      366    31    23.6  1.11
Bill Mazeroski     1957-1972   8050     1603   165   127.5  1.11
Tim Cullen         1966-1972   1542      305    25    19.0  1.11
Tom Herr           1979-1983   1584      313    31    23.4  1.10
Glenn Hubbard      1978-1983   2843      567    51    39.0  1.10
Charlie Neal       1957-1962   2146      435    37    28.2  1.09
----------------------------------------------------------------
Chuck Hiller       1961-1968   2063      450   -32   -22.9  0.94
Duane Kuiper       1974-1983   3562      812   -70   -51.5  0.94
Mike Andrews       1966-1973   2983      674   -58   -42.1  0.91
Pete Rose          1963-1969   2440      498   -46   -33.0  0.91
Len Randle         1971-1982   1810      428   -47   -34.8  0.90
Bobby Richardson   1957-1966   4976     1196  -150  -109.9  0.88
Tony Taylor        1958-1976   5735     1443  -208  -152.6  0.88
Jake Wood          1961-1967   1577      356   -48   -35.3  0.87
Cookie Rojas       1962-1977   5594     1402  -214  -157.3  0.87
Jorge Orta         1972-1979   2567      644  -107   -79.4  0.84

Dick Green and Rob Wilfong take the top two spots, while the rest of the top ten is littered with good fielders including Bill Mazeroski, who recorded the highest total at second base at +127 runs, and whose career is shown in Table 6. It's also interesting to note that Fielding Runs (FR) superstar Glenn Hubbard ranks ninth but doesn't record a total SFR value close to the +63 FR he shows in the latest Baseball Encyclopedia. His biggest years in FR, unfortunately, come in 1985-87, seasons for which we don't have SFR data yet and in which the Encyclopedia has him at +60, +40, and +28. Other names that didn't make the top ten include Joe Morgan at number 20 (+71/1.06), Bobby Grich 26th (+32/1.05), Frank White 29th (+25/1.04), who is another favorite of TotalZone, and Willie Randolph 41st (+11/1.02).

Table 6. Bill Mazeroski

Year   Pos     Balls  Runners  Diff    SFR  Rate
1957   Second    525      106    12    9.4  1.12
1958   Second    616      110    31   23.7  1.28
1959   Second    517      118    -3   -1.6  0.98
1960   Second    598      125     6    4.9  1.05
1961   Second    647      132    11    8.8  1.09
1962   Second    649      133     3    2.4  1.02
1963   Second    612       94    35   26.1  1.37
1964   Second    714      149    19   14.6  1.13
1965   Second    507       85    13   10.1  1.15
1966   Second    617      123     9    6.9  1.07
1967   Second    641      134     2    1.4  1.01
1968   Second    558      116     8    6.0  1.07
1969   Second    204       43     4    3.3  1.10
1970   Second    412       81    15   11.2  1.18
1971   Second    176       37     2    1.8  1.06
       Third      10        3    -1   -1.0  0.57
1972   Second     57       16    -2   -1.6  0.86
       Third       5        1     0   -0.1  0.88
Total           8066     1608   164    126  1.11

Apparently Jorge Orta couldn't always blame the sun, as he took the bottom spot with a rate of 0.84 and -79 runs in eight seasons at the keystone. Cookie Rojas, however, outdid him in total SFR at -157 runs, just edging out Tony Taylor.

As with Ozzie Smith, it's interesting to look at the early career of Lou Whitaker. Last week we said that Whitaker and Ryne Sandberg were neck and neck, with Whitaker edging Sandberg in Rate but Sandberg playing significantly more, resulting in 17 additional runs saved (80 to Whitaker's 63).
Table 7 shows the early part of Whitaker's career, when his rate was excellent, as he contributed another 59 runs:

Table 7. Lou Whitaker

Year   Pos     Balls  Runners  Diff    SFR  Rate
1977   Second     33        9    -2   -1.8  0.74
1978   Second    618      114    14   10.8  1.13
1979   Second    487       93    10    7.5  1.11
1980   Second    627      121    10    7.4  1.08
1981   Second    490       92    14   10.9  1.16
1982   Second    612      110    18   13.3  1.16
1983   Second    614      115    15   11.3  1.13
Total           3482      654    79     59  1.12

However, Sandberg's +40 in 1983 was among the highest single-season totals of the period; of the season leaders in Table 1, only Dick Groat's +51 in 1963 and Rick Burleson's +49 in 1975 were higher. In addition, Sandberg was at +9 at second in limited time in 1982 and also +10 at third base in his rookie campaign that same year, which together with 1983 adds another 59 runs to his total during this time span.

Third Base

As we move to the hot corner, consider Table 8, which lists the top and bottom third basemen, once again in terms of Rate:

Table 8. Top and Bottom Third Basemen by Rate, >= 1,500 Balls, 1957-1983

Player             Span       Balls  Runners  Diff     SFR  Rate
Brooks Robinson    1957-1977   9686     1404   372   293.0  1.29
Jim Davenport      1958-1970   2917      444    86    68.3  1.21
Aurelio Rodriguez  1967-1983   6266     1045   170   133.7  1.20
Ed Charles         1962-1969   2934      450    66    52.6  1.18
Buddy Bell         1972-1983   5514      925   143   113.2  1.16
Pete Ward          1963-1969   1819      298    39    31.0  1.16
Rico Petrocelli    1966-1976   2278      398    30    24.5  1.16
Bob Aspromonte     1960-1971   3129      485    53    42.4  1.15
Sal Bando          1966-1981   5959      967   125    99.1  1.15
Don Money          1970-1983   3286      570    46    36.9  1.12
----------------------------------------------------------------
Wayne Gross        1977-1983   1993      403   -25   -18.9  0.95
Charley Smith      1960-1968   2097      439   -41   -30.5  0.94
Eddie Yost         1957-1962   1903      389   -32   -24.0  0.94
Tony Perez         1967-1971   2487      519   -42   -31.5  0.93
Butch Hobson       1976-1981   2041      432   -38   -29.6  0.92
Pete Rose          1966-1979   1826      389   -42   -32.4  0.90
Bill Madlock       1973-1983   3186      748   -95   -73.2  0.88
Carney Lansford    1978-1983   2232      518   -74   -57.1  0.86
Harmon Killebrew   1957-1971   2256      533   -98   -74.9  0.82
Dick Allen         1964-1972   2252      591  -154  -118.4  0.75

Brooks Robinson is way out on top with a rate eight points higher than Jim Davenport's and a total SFR of +293 runs (TotalZone has him at +269), almost 100 runs better than Belanger, and nearly 160 runs better than his nearest third base competitor, Aurelio Rodriguez. His career totals are shown in Table 9.

Table 9. Brooks Robinson

Year   Pos     Balls  Runners  Diff    SFR  Rate
1957   Third     115       21     1    0.9  1.05
1958   Second      8        3     0   -0.3  0.86
       Third     487       88     3    2.3  1.03
1959   Second      0        0     0   -0.1  0.26
       Third     314       53     6    4.6  1.11
1960   Second      1        0     0    0.0  0.86
       Third     545       74    24   18.9  1.32
1961   Third     538       91    14   10.9  1.15
1962   Second      2        1     0   -0.3  0.52
       Third     542       72    30   23.3  1.41
       Short       1        0     0   -0.1  0.30
1963   Third     514       59    29   22.9  1.50
1964   Third     526       82    15   12.1  1.19
1965   Third     468       62    11    8.6  1.17
1966   Third     512       74    11    8.7  1.15
1967   Third     578       65    41   32.5  1.64
1968   Third     557       63    34   26.4  1.53
1969   Third     570       73    33   25.7  1.45
1970   Third     536       89    11    8.7  1.12
1971   Third     534       81    22   17.4  1.27
1972   Third     517       73    22   17.5  1.31
1973   Third     531       76    24   19.0  1.32
1974   Third     596      102    17   13.5  1.17
1975   Third     460       61    22   16.9  1.35
1976   Third     206       35     2    1.9  1.07
1977   Third      38        8     0    0.2  1.03
Total           9699     1408   371    292

Much of the rest of the top 10 is populated with familiar names, while Jim Gilliam at number 14 (+26/1.11), Eddie Mathews 18th (+47/1.08), and Graig Nettles 26th (+33/1.05) all come out looking respectable; Ron Santo disappointingly finished a distant 40th, at -22/0.99.

The Wikipedia entry for Dick Allen simply says "His uncertain and often disinterested defensive play led to his leading the league in errors four times--twice each at third and first base," which pretty much sums up what SFR thought of his effort. He took the bottom spot both in rate and total SFR at third base, finishing at -118/0.75. When you add his work at first base he ends up at an eye-popping -151 runs, worth 15 wins (FRAR has him at -98 for his career). The rest of the group isn't much better, and here we find Pete Rose making his second appearance in the bottom ten.
What is somewhat remarkable, as shown in Table 10, is that Rose managed to score negatively at all three infield positions he played during this time span (he played left field primarily from 1967 through 1974), with the only exception being three balls cleanly fielded at first base in 1978.

Table 10. Pete Rose

Year   Pos     Balls  Runners  Diff    SFR  Rate
1963   Second    605      122   -15  -10.7  0.88
1964   Second    458      103    -5   -3.7  0.95
1965   Second    625      112    -4   -2.4  0.97
1966   Second    613      131   -19  -13.9  0.85
       Third      59       12    -3   -2.2  0.77
1967   Second    135       27    -2   -1.5  0.93
1968   Second      2        1    -1   -0.4  0.49
1969   Second      1        1    -1   -0.5  0.26
1975   Third     397       92   -23  -18.0  0.75
1976   Third     486      101    -6   -4.4  0.94
1977   Third     430       85     0   -0.1  1.00
1978   First       3        0     0    0.3  2.67
       Third     441       95    -9   -7.3  0.90
1979   First     307       38    -2   -2.1  0.94
       Third      14        4    -1   -0.5  0.84
1980   First     291       43    -2   -1.9  0.94
1981   First     212       30     0   -0.2  0.99
1982   First     339       47    -4   -3.1  0.91
1983   First     201       30    -1   -0.9  0.96
Total           5620     1076   -98    -73

First Base

Finally, we'll wrap up with a look at first basemen, as shown in Table 11:

Table 11.
Top and Bottom First Basemen by Rate, >= 1,500 Balls, 1957-1983

Player             Span       Balls  Runners  Diff     SFR  Rate
Tommy McCraw       1963-1975   1601      166    46    35.1  1.37
Carl Yastrzemski   1968-1982   1767      184    47    36.1  1.36
Bill White         1958-1969   3006      330    38    28.5  1.22
Wes Parker         1964-1972   2237      233    34    23.1  1.18
Mike Jorgensen     1968-1983   1556      183    22    17.4  1.18
Nate Colbert       1968-1976   2004      256    31    20.2  1.17
Eddie Murray       1977-1983   1914      214    32    25.5  1.16
Orlando Cepeda     1958-1972   3893      468    38    24.5  1.15
Ron Fairly         1961-1978   2405      284    30    21.6  1.15
George Scott       1966-1979   4431      524    66    51.1  1.15
----------------------------------------------------------------
John Mayberry      1968-1982   3008      448   -31   -23.8  0.97
Lee May            1966-1982   3099      404   -21   -19.9  0.97
Willie McCovey     1959-1980   4115      562   -34   -29.1  0.97
Donn Clendenon     1962-1972   2584      376   -28   -23.1  0.95
Deron Johnson      1961-1976   1523      210   -24   -19.6  0.93
Willie Montanez    1970-1982   2539      374   -39   -29.7  0.91
Dick Allen         1969-1977   1550      237   -38   -31.6  0.88
Mike Epstein       1966-1974   1501      249   -39   -32.4  0.88
Willie Stargell    1963-1982   1605      246   -44   -34.2  0.85
Dick Stuart        1958-1969   2104      359   -76   -59.4  0.80

Here we find another surprise, in that Tommy McCraw takes the top spot with a rate of 1.37 in his 13 seasons (primarily time spent with the White Sox). He was certainly no great shakes with the bat, recording a lifetime .246/.309/.362 line, so one would certainly hope he was contributing with the glove. I was also surprised that Keith Hernandez wasn't on the list; in fact, Hernandez finished 19th at +18/1.02, whereas he led all first basemen at +110 in Sean Smith's metric (though that total also includes 1984 through 1990). Steve Garvey also ranks fairly poorly, coming in 21st (+16/1.05). However, both metrics liked George Scott, although SFR has him at +51 runs (and does cover his entire career) while TotalZone is at +85.
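For readers who want to check the Rate column against the other columns, here is a minimal Python sketch of the arithmetic. The column article describes Rate as the ratio of expected to actual baserunners; the sketch assumes that Diff is expected minus actual baserunners, so that expected = Runners + Diff. That reading is an assumption, not something the column spells out, though it does fit the single-season rows tried here. The function names are hypothetical, and the ten-runs-per-win default is the common rule of thumb rather than a figure from the column; it is at least consistent with Dick Allen's -151 runs being described as "worth 15 wins."

```python
def sfr_rate(actual_runners, diff):
    """Rate = expected / actual baserunners.

    Assumption: Diff = expected - actual, so expected = actual + diff.
    """
    if actual_runners <= 0:
        raise ValueError("rate is undefined without actual baserunners")
    return (actual_runners + diff) / actual_runners


def runs_to_wins(runs, runs_per_win=10.0):
    """Convert runs to wins; ten runs per win is a rule-of-thumb assumption."""
    return runs / runs_per_win


# (Runners, Diff) pairs copied from single-season rows in the tables.
rows = {
    "Belanger 1968, short": (112, 31),   # table Rate: 1.28
    "Rose 1975, third": (92, -23),       # table Rate: 0.75
    "Rose 1963, second": (122, -15),     # table Rate: 0.88
}

for label, (runners, diff) in rows.items():
    print(f"{label}: rate {sfr_rate(runners, diff):.2f}")

print(f"Dick Allen, career: {runs_to_wins(-151):.1f} wins")
```

The three single-season rows reproduce the published Rate to two decimals. Career lines do not always do so from the displayed totals (Tommy McCraw's 1.37 is one example), presumably because the displayed columns are rounded or the career Rate is computed some other way, so treat this as a sanity check rather than a reconstruction of SFR.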
Dick Stuart, who once said, "One night in Pittsburgh, thirty-thousand fans gave me a standing ovation when I caught a hot dog wrapper on the fly," apparently deserved his reputation as Dr. Strangeglove, since he's at the bottom in terms of both overall SFR (-59 runs) and rate (0.80). He has plenty of company, though, notably Willie Stargell (-34/0.85) and Mike Epstein (-32/0.88), and also the ubiquitous Dick Allen (-32/0.88).

The Journey Continues

As in many of life's endeavors, our quest for knowledge about baseball never ceases. Refining and extending ideas like SFR is only a very small part of that bigger picture. For those of us who study the game, the sentiment reflected in this question and answer from Bill James in a recent New York Times interview is apt:

Q: Has sabermetrics pretty much squeezed the last drop of new insights out of traditional counting statistics? If so, what data ought to be collected to improve our understanding of the game? If not, where can the boundaries be pushed?