November 8, 2011

Baseball ProGUESTus

Getting Explicit with Sample Sizes

by Matt Lentzner

Believe it or not, most of our writers didn't enter the world sporting an @baseballprospectus.com address; with a few exceptions, they started out somewhere else. In an effort to up your reading pleasure while tipping our caps to some of the most illuminating work being done elsewhere on the internet, we'll be yielding the stage once a week to the best and brightest baseball writers, researchers and thinkers from outside of the BP umbrella. If you'd like to nominate a guest contributor (including yourself), please drop us a line.

Matt Lentzner has carved out a (very) small niche in the baseball analysis world by examining the intersection of physics and biomechanics. He has presented at the PITCHf/x conference in each of the last two years and has written articles for The Hardball Times, as well as previous articles for Baseball Prospectus. When he’s not writing, Matt works on his physics-based baseball simulator, which is so awesome and all-encompassing that it will likely never actually be finished, though it does provide the inspiration for most of his articles and presentations. In real life, he’s an IT Director at a small financial consulting company in Silicon Valley and also runs a physical training gym in his backyard on the weekends.

Yes, the title sounds vaguely pornographic, but what I’m getting at is pretty serious. Well, as serious as one can get when talking about baseball stats, anyway. Let me regale you with a story that is rehashed almost daily in baseball articles. It should sound familiar.

I was reading an article by Dan Lependorf, a very sharp sabermetrically inclined writer for Athletics Nation. It was about Michael Choice, a highly touted prospect with the Oakland A’s. Choice turned in a solid performance in High-A this season and has looked like a monster in the Arizona Fall League.

A+ Stockton (2011): .285/.376/.542
AFL Phoenix (2011): .333/.424/.745

I haven’t been keeping up with the AFL that closely, so being the stats-oriented guy that I am, when I saw the numbers I was thinking, “Yeah, that’s pretty awesome, but how many PAs (plate appearances) is that?”

And right at that moment it hit me.

If you want to call yourself a sabermetrician, then you know that a statistic doesn’t mean anything without a sufficient sample size. Batting .400 over 10 plate appearances doesn’t mean anything. Batting .400 over a season is something amazing. We all intuitively know that.

But here’s what I think needs to happen: every stat printed should include its sample size in PAs. Always. This is what communicates the credibility of your number. So now Michael Choice’s lines look like this:

A+ Stockton (2011): .285/.376/.542 [542]
AFL Phoenix (2011): .333/.424/.745 [62]

We know from Russell Carleton’s (AKA Pizza Cutter) seminal work (gory details here) on sample size stability that OBP and SLG stabilize at 500 PAs. In other words, the performance that Choice logged in Stockton was worth considering, while the fireworks he’s been displaying in Phoenix are not very meaningful at all (12 percent of the required PAs). In fact, there are 11 other hitters bopping with an OPS over 1.000 at this time, and Choice is in the middle of the pack. It’s not even a special performance.
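The "12 percent" figure is just the sample divided by the stabilization point. As a minimal sketch of that arithmetic (the function name `stability_share` is made up for illustration, and the 500 PA default is the OBP/SLG threshold cited above):

```python
def stability_share(pa: int, stable_pa: int = 500) -> float:
    """Return the fraction of the stabilization threshold a sample covers.

    stable_pa defaults to 500, the OBP/SLG figure cited in the article.
    """
    if stable_pa <= 0:
        raise ValueError("stable_pa must be positive")
    return pa / stable_pa


# Michael Choice's two 2011 samples, measured against the 500 PA threshold.
stockton = stability_share(542)
phoenix = stability_share(62)

print(f"A+ Stockton: {stockton:.0%} of the required PAs")
print(f"AFL Phoenix: {phoenix:.0%} of the required PAs")
```

Anything at or above 100 percent has cleared the threshold; the Phoenix line is nowhere close.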

But there’s more to this story.

One of the knocks on Choice is his high strikeout numbers, the worry being that he will strike out too often to be an effective major-league hitter. He already strikes out a lot in High-A, and his strikeout rate is expected to rise as he faces better pitching in the upper levels. Here’s the strikeout performance he’s put on in Stockton and Phoenix:

A+ Stockton (2011): 24.7 K% [542]
AFL Phoenix (2011): 15.5 K% [62]

Much, much better in Phoenix, but with that small sample size, is it meaningful? As it turns out, strikeout rate stabilizes a lot faster than OBP and SLG—in only 150 PA. Check this out:

AFL Phoenix (2011): 15.5 K% [62/150]

This is much more meaningful. He’s over 40 percent of the way to a “real” number. Assuming Choice plays most of the games remaining on the schedule, he should pass 100 PAs. Although he won’t reach 150, whatever that K rate ends up being will be a heck of a lot more reliable than anything his OBP and SLG have to say. Even better would be a measurement of his contact rate, which stabilizes in only 100 PA. That’s a stat that is stable in the context of AFL baseball’s short schedule.

Taking the above into account, I’d like to amend my original statement. Sample size in PAs should always be printed. When only the sample is shown, the assumed stable sample size is 500 PAs. For any stat that stabilizes at a different number, the sample should appear in [sample/stability number] format.
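For anyone generating stat lines programmatically, the convention is easy to automate. The following is only a sketch of one way to do it; `format_stat` and `DEFAULT_STABLE_PA` are hypothetical names, not part of any existing tool:

```python
DEFAULT_STABLE_PA = 500  # assumed stability point when none is printed


def format_stat(label: str, value: str, pa: int,
                stable_pa: int = DEFAULT_STABLE_PA) -> str:
    """Append the sample size to a stat line.

    Prints [PA] when the stat uses the assumed 500 PA threshold,
    and [PA/threshold] when it stabilizes at a different number.
    """
    if stable_pa == DEFAULT_STABLE_PA:
        tag = f"[{pa}]"
    else:
        tag = f"[{pa}/{stable_pa}]"
    return f"{label}: {value} {tag}"


print(format_stat("AFL Phoenix (2011)", ".333/.424/.745", 62))
print(format_stat("AFL Phoenix (2011)", "15.5 K%", 62, stable_pa=150))
```

The first call leaves the threshold implicit; the second spells it out because strikeout rate stabilizes well short of 500 PA.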

Here are the stable sample sizes per Mr. Carleton’s original work on the subject:

50 PA: Swing %
100 PA: Contact Rate
150 PA: Strikeout Rate, Line Drive Rate, Pitches/PA
200 PA: Walk Rate, Groundball Rate, GB/FB
250 PA: Flyball Rate
300 PA: Home Run Rate, HR/FB
500 PA: OBP, SLG, OPS, 1B Rate, Popup Rate
550 PA: ISO
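The list above works naturally as a lookup table. Here is a rough sketch, with the thresholds copied from the list and the names `STABLE_PA` and `stable_stats` invented for the example, that reports which stats have reached their stability point at a given number of PAs:

```python
# Stabilization points in PA, transcribed from the list above.
STABLE_PA = {
    "Swing %": 50,
    "Contact Rate": 100,
    "Strikeout Rate": 150,
    "Line Drive Rate": 150,
    "Pitches/PA": 150,
    "Walk Rate": 200,
    "Groundball Rate": 200,
    "GB/FB": 200,
    "Flyball Rate": 250,
    "Home Run Rate": 300,
    "HR/FB": 300,
    "OBP": 500,
    "SLG": 500,
    "OPS": 500,
    "1B Rate": 500,
    "Popup Rate": 500,
    "ISO": 550,
}


def stable_stats(pa: int) -> list[str]:
    """List the stats whose stabilization point is at or below pa."""
    return [name for name, needed in STABLE_PA.items() if pa >= needed]


# Choice's AFL sample at the time of writing, then a 100 PA sample.
print(stable_stats(62))
print(stable_stats(100))
```

At 62 PA only swing percentage qualifies; at 100 PA contact rate joins it, which is why contact rate suits the AFL's short schedule.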

The warm summer months seem far away now, but before long we will have bid goodbye to the Hot Stove League and moved on to spring training and the regular season schedule. Hopefully, we won’t see too many “triple-S” or “SSS” (Small Sample Size) warnings when next season rolls around. They won’t be necessary, since the samples will be explicitly stated. And maybe we can discourage people from using stats incorrectly to try to find meaning where very little exists.

14 comments have been left for this article.

sykojohnny

(225)

This is one of the most important articles I have ever read in BP and I bet that it will also be one of the most overlooked. No stat should even be considered until the number of plate appearances is known.

Nov 08, 2011 08:20 AM

rating: 2

formersd

(4977)

I agree with the general premise of the article.

One other note is that the pitching in the AFL is considered inferior, so I suspect strikeout rates for all players drop. If I'm correct in this assumption, the conditions in which the improved strikeout rate is occurring should also be checked against the baseline for strikeout rate. If the baseline has moved significantly, you need to consider that factor as well.

Nov 08, 2011 08:50 AM

rating: 3

everettcase

(51916)

Great article

Nov 08, 2011 08:59 AM

rating: 1

John Carter

(22689)

I like the idea of showing sample size, although I guess it is something we generally have a feel for anyway, because we know the length of a season. If the stats are based on part of a season, then indicating the sample size is necessary.

Now tacking on the number of Plate Appearances where the stat becomes stable is getting a bit unnecessarily cumbersome. I think we understand that the more often the thing occurs that the stat is based on, the more stable the stat is given the same number of plate appearances. There is nothing magical or final about those stable indicator numbers. They are a general guideline that I think we have a good instinct for.

Nov 08, 2011 10:13 AM

rating: -2

Russell A. Carleton

BP staff

(35870)

Thanks for the "gory details" nod.

Nov 08, 2011 10:52 AM

Matt Lentzner

(16092)

My pleasure. I didn't know you were still knocking around here.

Nov 08, 2011 20:27 PM

rating: 0

Ben Lindbergh

BP staff

(37618)

He pops in periodically to remind us how much we miss him.

Nov 08, 2011 22:29 PM

Brian Oakchunas

(9790)

Line drive rate stabilizes at 150? Seems like something is wrong there.

Nov 08, 2011 11:46 AM

rating: 0

Russell A. Carleton

BP staff

(35870)

There probably is. Those are Retrosheet data 2003-2006, and Colin Wyers has previously (and elegantly) pointed out the flaws in RS batted ball data. Also, that's LD/PA, not per BIP. That probably makes it look a little more stable.

Nov 08, 2011 18:57 PM

ScottyB

(23917)

On a related note, would it be possible/useful to have things like standard deviations and/or confidence intervals for some of our more common sabermetric statistics?

Nov 08, 2011 18:45 PM

rating: 0

Ben Lindbergh

BP staff

(37618)

Hey, go easy on us! As someone who has to write an awful lot of triple-slash lines, the prospect of having to cite PA, SD, and CI multiple times an article is a little scary, statistically sound as it might be.

Nov 08, 2011 22:29 PM

Matt Lentzner

(16092)

Thanks for all the comments, guys.

Nov 08, 2011 20:28 PM

rating: 0

Lloyd Cole

(11313)

Reading this (and Mr. Carleton's excellent article) I started thinking about factors that might distort the rate at which particular statistics would stabilize. Though these factors might not be important for GROUPS of players, could they be important for INDIVIDUAL players? I'm thinking about things such as
--players changing teams (especially from one league to another, or one type of park environment to another) such that their statistics might be radically different before and after the change;
--players changing roles: it's frequently noted that young players don't see good results until they get a substantial amount of time in a full-time role, so would their tendencies be accurately reflected in a 50-PA to 150-PA sample size? might they show radically different results in their next 400 full-time PAs, even for stats like Strikeout Rate?;
--injuries: if you combine the statistics for a power hitter before and after a wrist injury (or, maybe more interestingly, partly after a wrist injury and then once it's had a chance to heal fully) to get a 500 PA sample for stabilizing sample size on SLG, how predictive is the result?
I'm probably missing something here, since my knowledge of statistics is laughably small, but I'd be interested in your thoughts....
And thanks for the reminder on this VERY important subject.

Nov 09, 2011 13:10 PM

rating: 0

Shadetree42

(33584)

I'm pretty sure most people know AFL stats are going to be a tiny number of PA. The only reason to mention PAs is if the number is less than expected for the relevant league. And if so, it's just as easy to write, "in 130 PA" as it is "[130]". I.e., not sure what this article is trying to establish that couldn't have been stated in a single sentence.

Feb 07, 2012 22:04 PM

rating: 0
