July 9, 2003
Lies, Damned Lies: Digging in the Backyard
Baseball is the National Game, but at the amateur level, it's also a regional one. The frozen tundra of the Upper Midwest and the rolling hills of the Appalachians do not afford the same opportunity to play the sport year-round as the marshes of Florida, or the sun-drenched ballfields of California. Major league teams, which collectively are responsible for drafting nearly 1500 players every year--a far bigger burden than their counterparts in other sports face--are keenly aware of the differences.

It simply isn't possible, or at least not economically feasible, to develop an accurate scouting report for every amateur prospect in the country. While the top national prospects will be scouted by everyone, teams go regional as the draft moves into its later rounds, focusing on players from their home territories (as the Braves do) or on players from regions in which the level of competition is perceived to be the highest--California, Florida, and the Southwest.

Most of you, I suspect, are aware of these disparities. When a player from a cold-weather state is selected high--like Rocco Baldelli, pride of Woonsocket, Rhode Island, or Joe Mauer, pride of St. Paul, Minnesota--their hometown is mentioned early and often, precisely because such selections are unusual. But it's striking just how profound the differences are. A high school senior from Texas is three times more likely to be drafted into professional ball than a high school senior from New York. A high school senior from Arizona is six times more likely to be drafted than a high school senior from Minnesota. A high school senior from Florida is nine times more likely to be drafted than a high school senior from Illinois.

Let's back up a second. We can come up with a pretty good estimate of the intensity with which a given state is scouted if we know two things: the number of players that are drafted, and the number of players that potentially could be.
The first part is easy--we'll use data from the 2003 amateur draft, focusing on players selected out of high school only, based on their respective home states. College and Juco selections would confuse matters here--institutions of higher learning are not evenly distributed among the population, and the allotment of colleges with good baseball programs is even more skewed. Players flock to where the good programs are, not the other way around.

There isn't, so far as I know, data on just how many high school baseball players there are in each state, but it's easy to come up with a reasonable proxy. The good folks at the Census Bureau have prepared a wealth of data organized by different demographic categories. One of these categories is age; the census estimated the number of 15-to-19-year-olds in each state as of 2000 (you can find the data here, but be warned--the PDF file linked is massive). If we further assume that: i) people are evenly distributed within this age group--that is, there are as many 16-year-olds as 18-year-olds, and ii) half of the people in this group are male (sorry, gals), we can come up with a good idea of the number of potential pro ballplayers there are in a given region.

For example, according to the census, there were roughly 540,000 people between the ages of 15 and 19 in North Carolina as of April, 2000. We assume that one-fifth of these people (108,000) were high school seniors this year (insert dropout joke here), and that one-half of these (54,000) are male. Using these two numbers, we can come up with an estimate of the likelihood that a player from a given state is drafted. In the table below, I've created an index of the number of MLB draftees per 10,000 male 12th graders.
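For readers who want to check the arithmetic, here is a minimal Python sketch of the index, using the rounded North Carolina figure quoted above and its 14 high school draftees. The function names are mine, invented for illustration, and the two assumptions are the ones already stated: an even age distribution within the 15-19 group, and a half-male population.

```python
def estimated_male_seniors(pop_15_to_19: float) -> float:
    """Estimate male high school seniors from a state's 15-19 population."""
    one_year_cohort = pop_15_to_19 / 5  # assumption i: ages evenly distributed
    return one_year_cohort / 2          # assumption ii: half of the cohort is male


def draftees_per_10k(hs_draftees: int, pop_15_to_19: float) -> float:
    """Index of high school draftees per 10,000 estimated male seniors."""
    return 10_000 * hs_draftees / estimated_male_seniors(pop_15_to_19)


# North Carolina: roughly 540,000 people aged 15-19, 14 high school draftees.
nc_seniors = estimated_male_seniors(540_000)
nc_index = draftees_per_10k(14, 540_000)
print(f"{nc_seniors:,.0f} eligibles, {nc_index:.1f} draftees per 10,000")
```

With the rounded population this comes out to about 2.6 draftees per 10,000, in line with the North Carolina row of the table.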
                    2003       Male HS Seniors   HS Draftees per
State               Draftees   (estimated)       10,000 Eligibles
Florida                82         101,407            8.1
Nevada                 10          12,720            7.9
Puerto Rico            22          31,344            7.0
Arizona                18          36,772            4.9
Colorado               14          30,724            4.6
Washington             18          42,797            4.2
California             98         245,089            4.0
Alaska                  2           5,009            4.0
Hawaii                  3           8,100            3.7
Oklahoma                9          26,937            3.3
Louisiana              12          36,595            3.3
Georgia                19          59,628            3.2
Indiana                13          45,348            2.9
Texas                  46         163,623            2.8
Alabama                 9          32,458            2.8
North Carolina         14          53,993            2.6
Tennessee              10          39,518            2.5
Virginia               11          48,407            2.3
Missouri                9          41,330            2.2
Oregon                  5          24,443            2.0
Massachusetts           8          41,574            1.9
North Dakota            1           5,362            1.9
Utah                    4          21,628            1.8
Iowa                    4          22,642            1.8
Maryland                5          35,612            1.4
Montana                 1           7,131            1.4
Mississippi             3          23,319            1.3
Wisconsin               5          40,720            1.2
New Jersey              6          52,522            1.1
Arkansas                2          19,877            1.0
Ohio                    8          81,687            1.0
Kansas                  2          21,012            1.0
Pennsylvania            8          85,099            0.9
Idaho                   1          11,086            0.9
Illinois                8          89,400            0.9
New York               11         128,754            0.9
Minnesota               3          37,436            0.8
West Virginia           1          12,558            0.8
Kentucky                2          28,900            0.7
New Mexico              1          14,575            0.7
Connecticut             1          21,663            0.5
Michigan                2          71,987            0.3
Washington DC           0           3,787            0.0
Wyoming                 0           4,190            0.0
Vermont                 0           4,577            0.0
Delaware                0           5,563            0.0
South Dakota            0           6,246            0.0
Rhode Island            0           7,545            0.0
New Hampshire           0           8,669            0.0
Maine                   0           8,949            0.0
Nebraska                0          13,491            0.0
South Carolina          0          29,538            0.0

There's quite a range of values here, from 8.1 (Florida, though Nevada makes a surprisingly strong showing) to 0.0 (several states, the largest of which are Nebraska and South Carolina). The map below reflects the regional pattern that I described earlier:
The Sunbelt is heavily scouted, the Bible Belt less so, and the Rust Belt all but abandoned. Being from the Midwest, I'm sensitive about this. The South might have superior weather, lower cigarette taxes, and prettier women, but does it have better ballplayers, too?

Before we attribute these differences to some sort of "bias," it's best to check up on what sort of return teams can expect to have on players from different regions. It would be dubious to claim that, say, Florida is overscouted if the concentration of good players there is commensurate with the state's heavy load of draft picks.

One metric that should work pretty well is the geographic distribution of major league players. California, for example, is home to about 12% of the high school seniors in the United States; by comparison, it was responsible for about 19% of the high school players selected in the June draft. But that figure looks pretty reasonable when you compare it to the number of major league players who hail from California. Between 1993 and 2002, Californians accumulated approximately 25% of major league plate appearances, and 18% of major league innings pitched (I'm not counting players who hailed from outside the U.S. and Puerto Rico). California is scouted very heavily, but deservedly so.

Florida, however, doesn't hold up so well. It accounted for 16% of this year's high school draftees, but only 8% of the major league PA in our sample, and under 5% of the major league IP. Sure, the state's population has grown a little bit in the past two decades, producing a lag effect that these numbers won't capture quite right, but the discrepancy here is huge; it seems likely that major league teams are getting a lower return on players from Florida than those from other parts of the country.

Other states that come up as overscouted include Arizona, Colorado, and Texas, though the latter is something of an exceptional case.
Turns out that there are a ton of pitchers from the Lone Star State--Texans accounted for 8.4% of the major league IP in our sample--but relatively few position players (just 2.9% of the sample). Whether that's due to the predominance of football in the state--if you can't play quarterback, pitch--the historical legacy of Nolan Ryan and Roger Clemens, or something else entirely, I don't know, but the split is dramatic. The complete set of data is provided below; I've listed a series of percentages indicating the following:
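1. The state's estimated share of the high school population.
2. The state's share of 2003 high school draftees.
3. The state's share of MLB plate appearances, 1993-2002.
4. The state's share of MLB innings pitched, 1993-2002.
5. The average of #3 and #4.
6. A "Delta" figure indicating the difference between MLB playing time (#5) and 2003 draft pick percentage (#2). Negative figures indicate that a state is overscouted, and positive figures that it's underscouted.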
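The Delta arithmetic can be sketched in a few lines of Python. This is an illustration rather than the original calculation: the function name is invented, and treating "MLB playing time" as the simple mean of the PA and IP shares is an assumption inferred from the published Average column. The inputs are the rounded percentages for Florida and Georgia.

```python
def scouting_delta(draft_share: float, pa_share: float, ip_share: float) -> float:
    """MLB playing-time share minus draft share, in percentage points.

    Playing time is taken as the simple mean of the PA and IP shares
    (an assumption inferred from the Average column). Negative values
    suggest a state is overscouted, positive values underscouted.
    """
    playing_time = (pa_share + ip_share) / 2
    return playing_time - draft_share


# Florida: 16.0% of draftees, 8.2% of MLB PA, 4.6% of MLB IP.
florida = scouting_delta(16.0, 8.2, 4.6)
# Georgia: 3.7% of draftees, 3.5% of MLB PA, 3.1% of MLB IP.
georgia = scouting_delta(3.7, 3.5, 3.1)
print(f"Florida {florida:+.1f}, Georgia {georgia:+.1f}")
```

Because the inputs here are already rounded to one decimal place, a result recomputed this way can differ from the published Delta by a tenth of a point for some states.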
                  Est HS       2003       MLB PA    MLB IP
State             population   Draftees   (93-02)   (93-02)   Average   Delta
Florida              4.9%       16.0%       8.2%      4.6%      6.4%    -9.6%
Texas                8.0%        9.0%       2.9%      8.4%      5.7%    -3.3%
Arizona              1.8%        3.5%       0.6%      0.8%      0.7%    -2.9%
Colorado             1.5%        2.7%       0.2%      0.7%      0.4%    -2.3%
Washington           2.1%        3.5%       1.6%      1.6%      1.6%    -2.0%
Nevada               0.6%        2.0%       0.3%      0.4%      0.3%    -1.6%
Tennessee            1.9%        2.0%       0.6%      1.0%      0.8%    -1.2%
North Carolina       2.6%        2.7%       1.8%      1.9%      1.8%    -0.9%
Utah                 1.1%        0.8%       0.0%      0.2%      0.1%    -0.6%
Indiana              2.2%        2.5%       1.9%      1.9%      1.9%    -0.6%
Oklahoma             1.3%        1.8%       0.9%      1.5%      1.2%    -0.6%
Georgia              2.9%        3.7%       3.5%      3.1%      3.3%    -0.4%
Virginia             2.4%        2.2%       1.2%      2.3%      1.7%    -0.4%
Hawaii               0.4%        0.6%       0.1%      0.6%      0.3%    -0.2%
Alabama              1.6%        1.8%       1.4%      1.6%      1.5%    -0.2%
Idaho                0.5%        0.2%       0.0%      0.0%      0.0%    -0.2%
Montana              0.3%        0.2%       0.0%      0.1%      0.0%    -0.2%
New Mexico           0.7%        0.2%       0.0%      0.1%      0.1%    -0.1%
Arkansas             1.0%        0.4%       0.3%      0.3%      0.3%    -0.1%
Alaska               0.2%        0.4%       0.0%      0.7%      0.4%    -0.0%
Missouri             2.0%        1.8%       1.0%      2.5%      1.8%    -0.0%
Vermont              0.2%        0.0%       0.0%      0.0%      0.0%    +0.0%
Oregon               1.2%        1.0%       1.4%      0.7%      1.1%    +0.1%
Washington DC        0.2%        0.0%       0.1%      0.1%      0.1%    +0.1%
West Virginia        0.6%        0.2%       0.2%      0.4%      0.3%    +0.1%
South Dakota         0.3%        0.0%       0.0%      0.2%      0.1%    +0.1%
Maine                0.4%        0.0%       0.0%      0.2%      0.1%    +0.1%
Iowa                 1.1%        0.8%       0.1%      1.7%      0.9%    +0.1%
North Dakota         0.3%        0.2%       0.3%      0.4%      0.3%    +0.2%
Rhode Island         0.4%        0.0%       0.2%      0.1%      0.2%    +0.2%
Wyoming              0.2%        0.0%       0.4%      0.0%      0.2%    +0.2%
Minnesota            1.8%        0.6%       0.9%      1.0%      0.9%    +0.3%
Wisconsin            2.0%        1.0%       1.0%      1.6%      1.3%    +0.3%
Nebraska             0.7%        0.0%       0.4%      0.3%      0.3%    +0.3%
New Hampshire        0.4%        0.0%       0.1%      0.6%      0.4%    +0.4%
Delaware             0.3%        0.0%       0.7%      0.0%      0.4%    +0.4%
Kansas               1.0%        0.4%       1.3%      0.5%      0.9%    +0.6%
Mississippi          1.1%        0.6%       1.6%      0.8%      1.2%    +0.6%
Louisiana            1.8%        2.3%       2.1%      3.9%      3.0%    +0.7%
Connecticut          1.1%        0.2%       1.1%      0.8%      1.0%    +0.8%
Maryland             1.7%        1.0%       1.7%      1.8%      1.8%    +0.8%
Puerto Rico          1.5%        4.3%       7.8%      2.4%      5.1%    +0.8%
Massachusetts        2.0%        1.6%       1.9%      3.0%      2.4%    +0.9%
South Carolina       1.4%        0.0%       1.2%      0.6%      0.9%    +0.9%
Kentucky             1.4%        0.4%       1.6%      1.2%      1.4%    +1.0%
New Jersey           2.6%        1.2%       2.5%      2.5%      2.5%    +1.4%
Pennsylvania         4.1%        1.6%       2.5%      4.8%      3.7%    +2.1%
Michigan             3.5%        0.4%       1.7%      3.4%      2.6%    +2.2%
California          11.9%       19.2%      24.9%     18.2%     21.5%    +2.4%
Ohio                 4.0%        1.6%       4.2%      4.9%      4.6%    +3.0%
New York             6.3%        2.2%       6.4%      4.4%      5.4%    +3.3%
Illinois             4.4%        1.6%       4.9%      5.0%      4.9%    +3.4%

For the visual learners among us, the most overscouted and underscouted states are mapped out below:
Any resemblance to the early returns from the 2000 presidential election is unintentional. Although I think these data make a convincing case, there are a number of things to remember here:
All that said, I think there are opportunities here for an enterprising organization. Warmer weather is conducive to playing baseball instead of another sport or another activity, and players from those states are no doubt more advanced at the same age than their counterparts from colder climes. But the longer seasons that this weather provides also give major league teams more opportunities to see a given player multiple times. Especially in the lower rounds of the draft, mere familiarity may trump other considerations. Increasing the intensity of the scouting effort during the shorter seasons in the Northeast and Midwest won't come without cost, but it's a strategy that might pay for itself and then some.

Moreover, while there's no doubt that there are meaningful differences in the quality of high school competition between different regions, major league teams may exaggerate their importance in determining who is going to make the superior professional. Much as we emphasize drafting on results instead of potential when it comes to college players, the case is somewhat reversed when it comes to high schoolers. The ages of 18-21 are a time of tremendous growth for most baseball players--far more so than the ages of 21-24. A high school player from a cold-weather state might not have had the same chance to refine his skills that a player from Florida or Texas would, but in the right organization, he'll have plenty of time to do so in the lower minors.

The strategy of shifting scouting resources toward colder-weather states might work especially well for an organization that is primarily focused on collegiate players. If it's true that high school players are still overrated as a group--and I think they are--it's sensible to focus on those high schoolers that are relatively underrepresented, in areas where the saturation of scouts from other organizations will be less fierce.
Major league teams are constantly looking for new sources of untapped talent, whether it's the Far East or Asia or even Europe. One of the most fruitful such areas may be right in their backyards.
Nate Silver is an author of Baseball Prospectus.