Baseball Prospectus home
  
  

April 27, 2012

The BP Wayback Machine

Royal Flush

by James Click

While looking toward the future with our comprehensive slate of current content, we'd also like to recognize our rich past by drawing upon our extensive (and mostly free) online archive of work dating back to 1997. In an effort to highlight the best of what's gone before, we'll be bringing you a weekly blast from BP's past, introducing or re-introducing you to some of the most informative and entertaining authors who have passed through our virtual halls. If you have fond recollections of a BP piece that you'd like to nominate for re-exposure to a wider audience, send us your suggestion.

The Royals ended a 12-game losing streak on Wednesday, but that wasn't nearly their longest in recent memory. To refresh your memory on the Royals' futility and the odds of long losing streaks, take another look at the article reproduced below, which originally ran as a "Crooked Numbers" column on August 18, 2005.
 

There's bad, there's the Colorado Rockies, and then there's the Kansas City Royals. If you're into the Jayson Stark "Useless Info" columns, you could easily notch thousands of words about how bad the Royals have been for the past decade or more, a situation only highlighted by their recent losing streak. It's a tough time to be a Royals fan and if you're one of the few, the proud, read on and perhaps you'll feel a little bit better about your team.

To start, let's get some perspective. The Royals' streak of 18 straight losses is not the worst run of baseball of all time. The worst losing streak in the major leagues since 1901 belongs to the 1961 Phillies, who managed to lose 23 games in a row from 7/29/61 to 8/20/61. Interestingly, it could have been a lot worse; the Phillies lost five in a row just before the streak, so they actually lost 28 of 29 games in what may very well be the worst month any team has ever had. Here are the rest of the worst:

Year    Team                    Games
1961    Philadelphia Phillies   23
1988    Baltimore Orioles       21
1969    Montreal Expos          20
1943    Philadelphia A's        20
1916    Philadelphia A's        20
1906    Boston Red Sox          20
1975    Detroit Tigers          19
1914    Cincinnati Reds         19
1906    Boston Braves           19

The Royals are on the cusp of greatness, but they're not quite there yet. But how bad is this streak? People have a tendency to grasp onto streaks because they're easily quantified. A team that's lost 15 games in a row is clearly worse than a team that's lost 12 or 10 games in a row. But streaks are as easily broken as they are quantified. Take baseball's greatest streak of all time: Joe DiMaggio's 56-game hitting streak. As many of you know, after his streak was broken DiMaggio hit in another 16 games straight, meaning the Yankee Clipper notched a hit in 72 of 73 games. It's easy to say that DiMaggio's 56-game hit streak was more impressive and improbable than Pete Rose's 44-game streak, but what's more impressive: hitting in 56 games in a row or hitting in 72 of 73 games?

To determine this, we need to get into some binomial distributions. If we assume that DiMaggio had a "true" probability of getting a hit in a game, then the question becomes quite simple. However, that's not quite true because we should instead assume he had a probability of getting a hit in an at bat; as such, things become much more complicated. Instead, let's get back to teams and winning games. The odds that a team with a winning percentage w will win any x number of games in a row is simply w^x. Conversely, the odds that they will lose any y number of games in a row is (1-w)^y. This is a binomial distribution, but a very simple one.
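The two expressions above are simple enough to check by hand, but a minimal Python sketch makes them concrete. The function names are ours, not anything from the original column, and the sketch assumes every game is an independent trial with the same winning probability w.

```python
def win_streak_probability(w: float, x: int) -> float:
    """Chance of winning x specific games in a row: w^x.

    Assumes each game is independent and won with probability w.
    """
    return w ** x


def loss_streak_probability(w: float, y: int) -> float:
    """Chance of losing y specific games in a row: (1-w)^y."""
    return (1.0 - w) ** y


# A coin-flip (.500) team losing three specific games in a row.
print(loss_streak_probability(0.5, 3))  # 0.125
```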

If, however, the goal is to determine if a team will win at least 2 of 3 games, the formula becomes more complex because more situations meet the standards for success. The team can win all three games, win the first two, win the last two, or win the first and last games, and all four situations must be counted. The odds of the team winning two particular games and losing the third--w^2 * (1-w)--must be multiplied by three, since there are three ways in which the team can win exactly two out of three games, and the result must then be added to the odds that they will win all three--w^3.

However, the key to the puzzle is Pascal's Triangle, a tool that reveals the binomial coefficient by which each result must be multiplied--three, in the case above. Essentially, the triangle shows how many different ways the final counted result can be achieved by different distributions of the binomial choice. There are three different ways the team can win two games and lose one, but only one way in which they can win all three. This is also referred to as "x choose y"--essentially, if one is faced with the decision to choose y games out of x total games, how many possible combinations add up to y.
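The "win at least 2 of 3" calculation can be sketched in a few lines of Python. This is only an illustration under the same independence assumption as above; `math.comb` supplies the Pascal's Triangle entry ("x choose y"), and the function names are our own.

```python
from math import comb


def prob_exactly(n: int, k: int, w: float) -> float:
    """Chance of exactly k wins in n independent games won with probability w."""
    # comb(n, k) counts the different orderings that produce k wins.
    return comb(n, k) * w ** k * (1.0 - w) ** (n - k)


def prob_at_least(n: int, k: int, w: float) -> float:
    """Chance of k or more wins in n games: sum the exact cases from k to n."""
    return sum(prob_exactly(n, j, w) for j in range(k, n + 1))


print(comb(3, 2), comb(3, 3))      # 3 ways to win two of three, 1 way to sweep
print(prob_at_least(3, 2, 0.5))    # 3 * (.5^2 * .5) + .5^3 = 0.5
```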

Getting back to the Royals, we run into another problem when estimating the probability of their losing streak or comparing it to other losing stretches: what is the Royals' probability of winning an individual game? In baseball, we typically assume that this probability is a team's winning percentage. Isn't this what the whole regular season is about, determining who's the best team by who has the highest probability of winning a baseball game? But as Keith Woolner reminded us before the season, 162 games isn't enough time to properly discern a team's "true" winning percentage, the probability that they will win any given game.

This is to say nothing of the fact that a team's probability of winning any given game is not a constant. Winning probability is affected by any number of factors: whether the team is home or away, who the starting pitchers are, who's in the lineup, who the opposing team is, and any number of other factors. We generally like to assume that those kinds of breaks even out over the course of 162 games, but if they did, then BP's Quality of Batters Faced and Quality of Pitchers Faced reports and all that hand-wringing about the unbalanced schedule and the wild card would be for naught.

While a team's winning percentage over 162 games is the best guess we have about their true probability, we must admit that there is an overwhelming probability that that number is wrong. It's going to be close, but the odds that the next 162 games would fall exactly the same as the previous 162 are minuscule. It's possible that by using the full season's winning percentage as a guide for a team's true probability of winning any single game, we're making such stretches of losing appear easier, but using only those games not involved in the streak would be an arbitrary removal of data. Thus, the full-season winning percentage is as close as we can get, so we'll stick with that.

Caveats aside, let's see what we can do about estimating just how bad the Royals are. First, let's take a look at the probability that a team with a given probability of winning each individual game will lose a certain number of games in a row.

In this graph, there are five hypothetical teams with winning percentages between .500 and .300. Note that their odds of losing the first game are exactly one minus their winning percentages, as we'd expect. As the losses pile up, the probabilities decrease dramatically, to the point that by the time we get to 13 or 14 losses, it's nearly impossible to tell the difference between a .500 team and a .300 team. This is encouraging because it means that with streaks of the Royals' magnitude, the winning probability of the team doesn't make that much difference and we can continue knowing that our errors will be small in this regard.
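For readers who want the numbers behind those curves, here is a rough Python sketch that tabulates (1-w)^n for five teams. The five winning percentages are our assumption, picked to span the .300 to .500 range; the original figure may have used different values.

```python
# Hypothetical "true" winning percentages. These five values are an
# assumption chosen to cover the .300-.500 range described in the text.
WINNING_PCTS = [0.500, 0.450, 0.400, 0.350, 0.300]


def loss_streak_curve(w: float, max_losses: int = 18) -> list:
    """Return [(1-w)^1, ..., (1-w)^max_losses] for one team."""
    return [(1.0 - w) ** n for n in range(1, max_losses + 1)]


curves = {w: loss_streak_curve(w) for w in WINNING_PCTS}

# Print one row per streak length, one column per team.
print("Losses  " + "  ".join(f"{w:.3f}".rjust(9) for w in WINNING_PCTS))
for n in range(1, 19):
    row = "  ".join(f"{curves[w][n - 1]:9.5%}" for w in WINNING_PCTS)
    print(f"{n:6d}  {row}")
```

In absolute terms every column falls below one percent by 14 losses, which is why the lines become hard to tell apart on a chart, even though the ratio between the .300 and .500 teams keeps growing.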

Now, let's assume for a minute that the Royals are actually a .319 team (their current winning percentage). What are the odds that they'll lose any given 18 games in a row? By binomial distribution, we know that that probability is .000984 or approximately 1015.5:1. That seems very impressive, but that's only the probability that they'll lose any given stretch of 18 games. A 162-game season can be viewed as 144 separate 18-game opportunities to lose 18 games in a row. While the Royals' chances of losing any given 18 games are 1015.5:1, their chances of losing any stretch of 18 games over the course of a 162-game season are actually closer to 6.6:1. What's more, the Royals, given their .319 winning percentage, had a 50:50 chance of losing 13 games in a row at some point during the season.
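The arithmetic in that paragraph can be retraced with a short Python sketch. It takes the quoted .000984 figure and the 144 opportunities as given and, importantly, treats each 18-game window as an independent trial; that independence is an assumption of this calculation, not an established fact. The helper name `odds_against` is ours.

```python
def odds_against(p: float) -> float:
    """Convert a probability p into 'X:1 against' odds."""
    return (1.0 - p) / p


P_LOSE_18 = 0.000984   # per-stretch probability quoted in the text
WINDOWS = 144          # number of 18-game opportunities quoted in the text

# Chance of at least one 18-game losing stretch, ASSUMING the windows
# are independent of one another.
p_season_naive = 1.0 - (1.0 - P_LOSE_18) ** WINDOWS

print(f"Any given 18 games: {odds_against(P_LOSE_18):.1f}:1")
print(f"Over the season:    {odds_against(p_season_naive):.1f}:1")
```

The per-stretch odds land within a fraction of a point of the quoted 1015.5:1 (the small gap comes from rounding .000984), and the season-long figure comes out near 6.6:1.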

How does that compare to historical streaks? Obviously it's not as bad as the '61 Phillies, but it's possible that the Royals are breaking up several losing streaks with lone wins to make things look better. So instead of looking simply at streaks, let's see how bad the Royals are over a given stretch of games. For example, the Royals are 14-40 over their last 54 games, but let's round it off to 50--in which they were 13-37. Compare that to the worst 50 game stretches since 1901:

YEAR    TEAM    W   L   Win_Pct Prob    Odds:1  InSeason    Odds:1
1916    PHA     4   46  .235    0.41%   240.6   37.15%      1.7
1937    PHA     7   43  .358    0.06%   1796.7  6.04%       15.6
1932    BOS     8   42  .279    3.74%   25.7    98.60%      0.0
1915    PHA     8   42  .283    3.31%   29.2    97.70%      0.0
1961    PHI     8   42  .305    1.51%   65.2    81.80%      0.2
2004    ARI     8   42  .315    1.05%   94.1    69.38%      0.4
1943    PHA     8   42  .318    0.92%   107.3   64.60%      0.5
1949    WS1     8   42  .325    0.71%   138.9   52.22%      0.8
1996    DET     8   42  .327    0.65%   153.5   51.67%      0.9
1979    OAK     8   42  .333    0.50%   197.7   43.17%      1.3
1907    SLN     8   42  .333    0.50%   197.7   43.17%      1.3
1923    BSN     8   42  .351    0.24%   413.6   23.70%      3.2
1982    MIN     8   42  .370    0.10%   1011.8  10.47%      8.5

The Royals are not even close. Getting back to our original question, what's more impressive: the Philadelphia A's going 4-46 over 50 games or the Philadelphia Phillies losing 23 in a row in 1961? Returning to our binomial distributions, the probability of a .235 team--the '16 A's--winning 4 games or fewer in a given 50-game stretch is about 240.6:1 and over a season is a mere 1.7:1. This year's Royals--by virtue of their .319 winning percentage--have a 148.9:1 chance of matching that feat any time in a season (16,733.9:1 in any 50-game stretch). As mentioned, the odds of the Royals losing 18 games in a row at any point in the season are 6.6:1. Expand that to 50 games and the Royals would have to lose 44 of 50 to match those odds. Furthermore, while there have been several streaks longer than the Royals', only the '37 Athletics and '82 Twins can claim more improbable stretches of bad baseball since 1901.
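The per-stretch figures for "4 wins or fewer in 50 games" are cumulative binomial probabilities, which can be sketched in Python as follows. This is our own illustration, assuming independent games and the rounded winning percentages printed in the text, so the results will only approximate the published odds.

```python
from math import comb


def prob_at_most(n: int, k: int, w: float) -> float:
    """Chance of k or fewer wins in n independent games won with probability w."""
    return sum(comb(n, j) * w ** j * (1.0 - w) ** (n - j) for j in range(k + 1))


def odds_against(p: float) -> float:
    """Convert a probability p into 'X:1 against' odds."""
    return (1.0 - p) / p


# Rounded winning percentages as printed in the text.
for label, w in [("1916 A's (.235)", 0.235), ("2005 Royals (.319)", 0.319)]:
    p = prob_at_most(50, 4, w)
    print(f"{label}: {p:.4%} per 50-game stretch, about {odds_against(p):,.0f}:1")
```

Because the inputs are rounded to three decimals, the Royals figure in particular can drift by a few hundred from the quoted 16,733.9:1 while staying in the same ballpark.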

The Royals' streak is already more improbable than all but 2 stretches of 50 games since 1901 as well as those few teams that notched longer pure streaks than they did. But each game that the Royals lose makes their stretch more and more improbable, likely vaulting them past those few teams remaining ahead of them. Is there some solace to be taken in the fact that the Royals' improbably bad stretch was over one 18-game stretch and not a 50-game valley? Maybe, but if you're looking for the most improbable losing streak in baseball, the Royals are certainly making a case.

***

Note: the following is an excerpt from James Click's follow-up article, "Going Streaking," which appeared as a "Crooked Numbers" column on August 25, 2005.

Not unlike the old Sports Illustrated Jinx, it seems that as soon as we talk about something here at BP, things turn around. Jonah Keri covered Sunday's A's game yesterday in his Game of the Week column, but it's safe to say that my last two columns--about the Royals' losing streak and the A's winning ways--have made large U-turns in the last week. The Royals managed to finally break out of their near-record slump and it's this subject that deserves a little more of our attention.

Last week's column was a protracted discussion of the Royals' losing streak, its historical place, and its likelihood. Unfortunately--as many readers and one fellow BP author pointed out--there was an error in the discussion, specifically this paragraph:

"Now, let's assume for a minute that the Royals are actually a .319 team (their current winning percentage). What are the odds that they'll lose any given 18 games in a row? By binomial distribution, we know that that probability is .000984 or approximately 1015.5:1. That seems very impressive, but that's only the probability that they'll lose any given stretch of 18 games. A 162-game season can be viewed as 144 separate 18-game opportunities to lose 18 games in a row. While the Royals' chances of losing any given 18 games are 1015.5:1, their chances of losing any stretch of 18 games over the course of a 162-game season are actually closer to 6.6:1. What's more, the Royals, given their .319 winning percentage, had a 50:50 chance of losing 13 games in a row at some point during the season."

The probability of the Royals losing any particular 18 games in a row is right: it is .000984. Furthermore, the odds of them losing 18 games in a row given 144 chances are actually 6.6:1. So if those are both right, where's the error? The problem comes from looking for streaks of exactly 18 games versus streaks of 18 or more. If a team loses 18 games in a row and then loses their next contest, they now have two 18-game losing streaks, overlapping by 17 games. Thus, if you view any losing streak of 18+n games as n+1 streaks of 18 games--as my calculations did--then you're vastly overestimating the likelihood that a given team will lose at least 18 games in a row.

As Rany Jazayerli pointed out, "In other words, it's not accurate to say that the odds of not losing 18 in a row on Day X is (1-.000984) = .999016, and (.999016)^144 = the odds of not losing 18 in a row over an entire season. On Opening Day, the odds of starting an 18-game losing streak is .000984; from that day on, the odds are (.000984) * (.319)." (He also pointed out that a 162-game season provides 145 opportunities to lose 18 games, not 144.)

A more accurate formula to answer the question we were asking--how likely is a team of a given winning percentage to lose a certain number of games in a row at some point during the season?--would be this:

1-(1-((1-W%)^G)*W%)^(163-G)

Where W% is the team's "true" winning percentage and G is the number of games in the streak. Let's break this down into pieces to get a better idea of what's going on. 1-W% is the odds a team will lose a game, so (1-W%)^G is the odds they will lose the required number of games in a row. Then, ((1-W%)^G)*W% is the likelihood that a team will start a losing streak of the required number of games after winning the game before, and 1-((1-W%)^G)*W% is the likelihood that it will not. This is the key component that Rany identified that prevents us from counting streaks of G+1 games as two streaks of G games and thus inappropriately doubling the odds of a team losing that many games in a row. Raising 1-((1-W%)^G)*W% to the power of (163-G) gives us the odds that a team will not encounter the required streak, and thus we finish by subtracting from one. (We use 163 instead of 162 because any number of games X is X+1 chances to have the required streak; for example, if we wanted to know how many chances a team has to lose one game in two games, we wouldn't raise it to 2-1, we'd raise it to 3-1 since there are two chances to lose.) In effect, it's the same formula from last week, but we've multiplied the odds of the streak in any given number of games by the team's winning percentage before running it for the full season.
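As a rough sketch, the formula translates into Python piece by piece. The function and parameter names are ours, and the code simply follows the formula as written above, still assuming independent games with a constant W%. Results can differ in the last digit or two from published figures depending on how W% is rounded and exactly how the number of chances is counted.

```python
def season_streak_probability(w: float, g: int, season_games: int = 162) -> float:
    """Approximate chance of a losing streak of g or more games in a season.

    Implements 1-(1-((1-W%)^G)*W%)^(163-G) as stated in the text.
    """
    lose_g_in_a_row = (1.0 - w) ** g           # (1-W%)^G
    start_after_win = lose_g_in_a_row * w      # ...preceded by a win
    chances = season_games + 1 - g             # 163-G for a 162-game season
    return 1.0 - (1.0 - start_after_win) ** chances


for g in (13, 18, 19):
    p = season_streak_probability(0.325, g)
    print(f"{g} or more straight losses at .325: {p:.3%}")
```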

Whew... Got all that? Now, what can we say about the Royals and their eventual 19-game losing streak? Using this formula, let's see what chances the Royals have of losing a given number of games or more in a row over the course of a 162-game season given their shiny new .325 winning percentage:

 W%      Streak     Odds     Season Odds
.325        1       67.480%     100.000%
.325        2       45.535%     100.000%
.325        3       30.727%     100.000%
.325        4       20.734%      99.998%
.325        5       13.992%      99.933%
.325        6        9.441%      99.229%
.325        7        6.371%      96.104%
.325        8        4.299%      88.562%
.325        9        2.901%      76.550%
.325        10       1.958%      62.121%
.325        11       1.321%      47.800%
.325        12       0.891%      35.304%
.325        13       0.602%      25.305%
.325        14       0.406%      17.757%
.325        15       0.274%      12.278%
.325        16       0.185%       8.404%
.325        17       0.125%       5.713%
.325        18       0.084%       3.865%
.325        19       0.057%       2.607%
.325        20       0.038%       1.754%
.325        21       0.026%       1.179%
.325        22       0.017%       0.791%
.325        23       0.012%       0.531%
.325        24       0.008%       0.356%
.325        25       0.005%       0.239%

As opposed to last week, when I stated the odds of the Royals losing at least 18 in a row were 6.6:1, now we can see that it's 3.865%, or closer to 24.9:1. Given the relative infrequency of losing streaks of this magnitude, those odds fit much better with the actual results over the history of baseball.
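One way to sanity-check a figure like this is to compute the streak probability exactly by tracking the current run of losses game by game. The Python sketch below is our own cross-check, not the method used in the column, and it keeps the same simplifying assumption of independent games with a constant winning percentage.

```python
def exact_streak_probability(w: float, g: int, n: int = 162) -> float:
    """Exact chance of at least one run of g straight losses in n games.

    state[k] is the probability that no run of g losses has happened yet
    and the team is currently on a run of exactly k losses.
    """
    state = [0.0] * g
    state[0] = 1.0
    hit = 0.0  # probability that a run of g losses has already occurred
    for _ in range(n):
        nxt = [0.0] * g
        for k, pr in enumerate(state):
            nxt[0] += pr * w                  # a win resets the run
            if k + 1 == g:
                hit += pr * (1.0 - w)         # this loss completes the streak
            else:
                nxt[k + 1] += pr * (1.0 - w)  # the run grows by one
        state = nxt
    return hit


print(f"{exact_streak_probability(0.325, 18):.3%}")
```

Under these assumptions the exact answer for a .325 team and an 18-game streak lands close to four percent, in the same neighborhood as the corrected figure and far from the earlier 6.6:1.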

There's still more that can be done with understanding streakiness in baseball. Keith Woolner pointed out that "Another problem with the binomial model of game outcomes used in the article [is] that [it] implicitly assumes that game outcomes are independent within a streak." Given the nature of baseball scheduling, starting pitcher rotations, injuries, and any number of other factors, a team's winning percentage is likely to vary wildly when looking at each individual game or series. Determining the improbability of a streak depends heavily on a team's winning percentage and absent that, it's difficult to say just how unlikely a sudden nose dive is. Regardless of the actual probability, the Royals' 19-game losing streak was one of the longest in baseball history and it's a great deal more improbable than we found last week.
