May 5, 2017
Reconciling the Outliers
This article is going to be a little bit difficult for me. As a Semi-Professional Baseball Expert, it’s my job to balance unfailing and valuable baseball omniscience with a certain level of openness and humility, one that conveys a willingness to learn and grow with the world. It’s a tricky proposition, and if I tend to succeed at either, it’s almost entirely toward the side of the latter. But there’s one facet of baseball that I don’t understand, that I’ve never understood, and it forces me to expose the real possibility of my ignorance. I’ve tackled it in multiple articles before, all of which were received with the warmest of apathy. But it’s a question of human nature, so it never really goes away.
As Zach Crizer pointed out last week, Eric Thames is having a pretty good start to the third act of his career. He’s had a quiet week since, slashing .200/.333/.267 in 21 plate appearances, leading to a strange kind of lull in the happy roar of his impressive 2017 season. But even still, he enters the day fifth in WARP (1.87) and fourth in True Average (.427), exceeding his preseason PECOTA projection of ... well.
Those who enjoy baseball have seemingly been handed a random, early Flag Day present, a new source of excellence and unmitigated joy. Among the dozens of injuries and disappointments, Thames' is one of the seemingly few feel-good stories of the young season. Except that some folks have a hard time feeling good.
I got a late start in sportswriting. Sometimes I curse myself for this, for not getting online when the getting was good. And then a tweet like this (and the many others like it) reminds me of just how pointless and reductive so much of the conversation in the early-to-mid 2000s was. Once the word “steroid” is introduced, there is no nuance, no gray: the player in question either cheated, or didn’t. There’s nothing else to say. The very question was enough to handicap for years the Hall of Fame efforts of Jeff Bagwell and Mike Piazza, the doubt affixing itself to various numbers long after the players themselves, let alone their scandals, faded from memory.
This is what troubles me. I have never been able to reconcile the narrative of Brady Anderson.
Brady Anderson hit 16 home runs one year. Then he hit 50*. Then he hit 18.
His masterpiece came in 1996, two years before McGwire and Sosa, an introduction to the Steroid Era. It was an unbelievable performance, but we hadn’t yet realized why it was so unbelievable. So when the facade began to crumble, Anderson’s season came down with it, despite the fact that he was never, technically, proven to have done anything wrong.
I don’t want to digress too much, but the condemnation never made much sense to me. If steroids helped Anderson make this incredible jump in performance, why would he immediately stop (and even if he did, why would his numbers immediately follow suit, given what we know about the long-term effects of anabolic enhancements)? Some have whispered that Anderson didn’t like the side effects of his handiwork, but it’s hard to believe he wouldn’t enjoy the side effects to his budget after another 40-plus home runs in a walk year. A more realistic perspective is that nagging injuries in 1997 and beyond masked a more robust peak.
That said, there will never be any shortage of reasons to believe. He became very muscular, although he was perhaps not the first player in history to lift weights. He was open in his love of creatine, seen as a step into the moral gray area of supplements that would lead to direct cheating, although most players now enjoy the benefits of authorized medication. Perhaps worst of all, we lack the precise data to evaluate whether Anderson made any significant alterations to his swing or approach. But more than anything else, he hit 50 home runs, a number our collective imagination simply would not permit him. It was literally unbelievable.
This is the hidden crime of steroids, one that still haunts us in these drug-tested times: the idea that outliers are impossible. Anything too amazing, too delightful, is a sign not of our own internal rules being tested, our suppositions on the limits of greatness redefined, but that we are being duped. Paranoia wraps around to become cold realism. And somehow, ironically, it’s what people least expect that fails to make them happy.
We’ll never know, barring deathbed confession, whether Anderson actually used anabolic steroids or not. The interesting questions lie in how fans handle the uncertainty. In the poll above, I intentionally avoided any opportunity for nuance, any “probably” or “possibly” answers. Because ultimately, when it comes down to it, the question really is binary: should there be an asterisk next to that 50, or not?
Surely, 75 percent of the respondents didn’t vote yes based on the evidence above; they voted based on the circumstances. It’s an application, conscious or unconscious, of Bayesian probability. Bayes turned the study of probability on its head by adding a new element: rather than just calculating P(A), the chance of an event A on its own, we could calculate it given our knowledge of another, dependent event B. The formula:
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
With Anderson, the question isn’t just whether he did steroids, but whether he did steroids given that he played in the Steroid Era. That extra factor drives up the number, and with it the perceived likelihood of his guilt.
But there’s a problem with Bayes. While it can be eye-opening in certain cases (the usual examples are the chances of something being true given a positive test result, in an environment with false positives, or cancer rates given age or other factors), it can also be misused. First, it assumes that the two variables are truly dependent, which is an extra step in logic in itself, and second, it assumes the ability to calculate the probability of each given the other. The end result is that Bayesian probability can introduce preconceived notions into an equation, by factoring them into the probability of the original event. It’s a means for legitimizing belief by dressing it up as Math, using one’s own probabilities to arrive at the desired outcome.
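To make that last point concrete, here is a minimal sketch of Bayes' rule in Python. Every number in it is invented purely for illustration: the likelihoods (how often a user and a non-user would hit 50 home runs) and the three priors are assumptions, not estimates drawn from any data, and the function name `posterior` is just a label for this example.

```python
def posterior(prior, p_evidence_given_a, p_evidence_given_not_a):
    """Bayes' rule: P(A | B) from a prior P(A) and the two likelihoods of B."""
    numerator = p_evidence_given_a * prior
    # Total probability of the evidence B, summed over "A" and "not A".
    evidence = numerator + p_evidence_given_not_a * (1.0 - prior)
    return numerator / evidence


# HYPOTHETICAL likelihoods, chosen only to show the mechanics:
# chance of a 50-homer season for a user versus a non-user.
P_50_GIVEN_USER = 0.02
P_50_GIVEN_CLEAN = 0.001

# Three different readers, three different priors about steroid use.
for prior in (0.05, 0.25, 0.50):
    result = posterior(prior, P_50_GIVEN_USER, P_50_GIVEN_CLEAN)
    print(f"prior {prior:.2f} -> posterior {result:.3f}")
```

The same evidence produces a different verdict for each prior, which is the worry: the answer that comes out depends heavily on the belief that went in.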
In this case, when the original result is outside the bounds of belief (given Anderson’s 1995 HR/PA rate, a binomial calculation puts the chance of his hitting 50 home runs in 700 plate appearances at 0.0001 percent), people are going to use Bayes, whether they realize it or not, to add in explanatory factors. And steroids are always, always going to be the easiest factor. We’re driven to this behavior by so many forces: a desire to reject answers that defy explanation, a pride in our projections and estimations, our natural hatred of being lied to. In the end, we’d rather hang a guilty man’s stat line than let one go free.
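For anyone who wants to try that kind of binomial calculation themselves, a rough sketch follows. It treats every plate appearance as an independent trial with a fixed home run probability and sums the chance of 50 or more homers. Two things here are assumptions rather than facts from this article: the 1995 plate-appearance total (650 is a placeholder, so the rate is only approximate) and the choice to count "50 or more" rather than exactly 50. The result will shift with both, so it should not be expected to reproduce the figure quoted above.

```python
import math


def binomial_pmf(n, k, p):
    """P(exactly k successes in n trials), computed in log space.

    Requires 0 < p < 1; log space avoids overflow in the binomial coefficient.
    """
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_choose + k * math.log(p) + (n - k) * math.log1p(-p))


def binomial_tail(n, k_min, p):
    """P(at least k_min successes in n trials)."""
    return sum(binomial_pmf(n, k, p) for k in range(k_min, n + 1))


HR_1995 = 16            # the home run total the article cites before the 50
ASSUMED_1995_PA = 650   # ASSUMPTION: placeholder, not a figure from the article
rate = HR_1995 / ASSUMED_1995_PA

prob = binomial_tail(700, 50, rate)
print(f"P(50+ HR in 700 PA) is about {prob * 100:.2e} percent")
```

Whatever inputs are used, the model itself is the soft spot: it assumes the player's true talent never changes, which is exactly the thing an outlier season calls into question.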
And in the unsatisfying end, we still don’t know. Eric Thames may get caught tomorrow, or might just go on a 5-for-50 slump; Anderson may write a tell-all; MLB might admit the balls last year really were juiced. That’s the worst thing about writing about probability theory, other than the page views.
And while it’s still each person’s individual prerogative to incorporate all these independent and dependent variables into a worldview, to decide which numbers are real and which players virtuous, I feel compelled to issue a quiet warning. Everyone has the right, and also the unconscious compulsion, to take what we know and predict the future. But when it comes to judging the past, especially legacies that border on moral territory, like cheating ... it might be worth employing proof, rather than probability, for assigning judgment. Especially when that probability can be so, so soft.
We’ve just emerged from April, the Age of Eugenio Suarez and the Above-.500 Chicago White Sox, when nothing can be believed. The wide-eyed trendspotting of sportswriters of yore has converted into a modern cynicism toward small samples and aberrations, mostly with justification. Peel your eyes accordingly. But remember: it’s also hard to have an amazing season without an amazing first sixth of a season, and amazing seasons, and outliers, do happen. Don’t just wave off all joy as too good to believe. Find some worth believing in.