A Brief Statistical Analysis for Prediction of the 2015 WSOP Main Event Field Size
Recently, Editor-in-Chief Donnie Peters wrote an interesting article titled "Will the 2015 WSOP Main Event Field Size Increase or Decrease Over 2014?" In the article, Peters shares a side-by-side comparison of the World Series of Poker Main Event field size with the field sizes of four other major poker tournaments: the main events of the PokerStars Caribbean Adventure, Aussie Millions, World Poker Tour, and European Poker Tour.
While Peters presents a very intuitive explanation of these data and concludes that, based on these numbers, we might expect a downturn in the 2015 WSOP Main Event field size, I wanted to explore these data further with a formal analysis. First, Peters notes that, over the past decade, the Aussie Millions has been the best predictor of WSOP field sizes. Below is a scatterplot of WSOP versus Aussie Millions field sizes, and with the exception of the outlying year, 2006 (when the WSOP peaked at 8,773 entrants), the trend is indeed fairly consistent and linear.
Now, in 2015, we have already observed that the Aussie Millions Main Event gathered only 648 entrants. This is a dip from last year, but as Peters notes, it's a very small one (2014 saw 668 entrants), so it might be hard to conclude anything from this. However, Peters also notes that each of the other major events has experienced smaller numbers than in 2014 as well. It is possible to analyze this formally by running a statistical model known as Multiple Linear Regression. From this, we can obtain a mathematical prediction for the number of WSOP entrants based on all of the other events combined, so I thought it would be interesting to do that here.
First, the results.
My analysis predicts that the 2015 WSOP Main Event field size will be approximately 6,437. This would be a very modest decrease from last year by about 250 players, or approximately 3-4 percent.
There are several caveats to this. Most notably, this analysis is based exclusively on the numbers from each of the other four events. It does not, for example, incorporate any marketing strategies that the WSOP might be pursuing to increase its numbers, or any other factors specific to the WSOP that may be at play this year, as Peters discusses in his article. Additionally, the sample size is pretty small, as we only have 10 years of data from all five tournaments. Furthermore, 2006 is indeed a strange year. With 8,773 players, it is the largest WSOP field so far, and it was also the year that the Unlawful Internet Gambling Enforcement Act (UIGEA) was passed. For this latter reason, I decided to remove 2005 and 2006 from the analysis. Anyone who has taken a statistics course should know that removing outliers from a dataset is a dubious proposition without proper justification, as you typically want to let all of the data speak. In this case, however, I believe that the landscape of poker changed significantly enough with the passing of UIGEA to justify the exclusion, most notably because online poker sites were no longer able to buy their satellite winners directly into the WSOP.
Despite all of the caveats, I believe that 6,437 is still an interesting figure, in the sense that it uses the information from the four other major tournaments in a strictly mathematical manner. It also confirms Peters' intuitive conclusion that, based solely on the numbers, we should expect a slight decrease in the number of entrants in the WSOP this year.
To briefly explain how the methodology works, consider the scatterplot above, with WSOP-versus-Aussie Millions field sizes. Visually, we notice an increasing trend — that is, as Aussie Millions field size increases, WSOP field size tends to increase as well. We could quantify this by fitting a line to the data. The slope of this line would tell us exactly how much we would expect WSOP field size to increase, for every one-person increase in Aussie Millions field size.
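For readers who like to see the idea in code, here is a minimal sketch of the line-fitting step described above, written in Python with NumPy. The numbers are made up purely for illustration (they are not the real tournament data, and the variable names are my own); they are constructed to sit exactly on a known line so that you can check the fit recovers its slope and intercept.

```python
import numpy as np

# HYPOTHETICAL field sizes for illustration only, not the real tournament data.
aussie = np.array([500.0, 550.0, 600.0, 650.0, 700.0, 750.0])

# Constructed to lie exactly on the line: wsop = 2000 + 7 * aussie
wsop = 2000.0 + 7.0 * aussie

# Fit a straight line (a degree-1 polynomial) by least squares.
# np.polyfit returns coefficients from highest degree down: [slope, intercept].
slope, intercept = np.polyfit(aussie, wsop, deg=1)

# The slope is the expected change in WSOP field size for every
# one-person increase in Aussie Millions field size.
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")

# Plug a new Aussie Millions field size into the fitted line.
# Because the inputs are made up, this number is not a forecast.
predicted = intercept + slope * 648
print(f"fitted value at 648 entrants: {predicted:.0f}")
```

With real data the points would scatter around the line rather than sit on it, and the fitted slope would be an estimate rather than an exact recovery.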
Now, Multiple Linear Regression uses exactly this same idea, but in multiple dimensions. It is difficult to visualize a scatterplot in more than two or three dimensions, but mathematically, we essentially have one dimension for every variable in the dataset. In this case, we have five variables (one per tournament), so we have five dimensions. Then, we fit a line (or technically, a hyperplane, since we are in higher dimensions) to the data, and there is a slope estimate corresponding to each of the four predictor tournaments. From the resulting estimates, we can then determine the predicted 2015 WSOP Main Event field size based on the field sizes from all of the other major tournaments that have already happened this year.
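The same idea with four predictors can be sketched as follows. This is only an illustration under stated assumptions, not the exact analysis behind the 6,437 figure: the eight "years" of data are randomly generated stand-ins, the coefficients in `true_coefs` are invented, and the "new year" field sizes are placeholders. The point is just to show the mechanics of fitting a hyperplane by ordinary least squares and plugging new values into it.

```python
import numpy as np

rng = np.random.default_rng(0)

# HYPOTHETICAL data: 8 years (rows) by 4 predictor tournaments (columns).
# These are random stand-ins, not the real field sizes.
X = rng.uniform(300, 1500, size=(8, 4))

# Invented "true" relationship: an intercept followed by four slopes.
true_coefs = np.array([1000.0, 2.0, 3.0, 0.5, 1.5])

# Add a column of ones so the model can estimate an intercept.
A = np.column_stack([np.ones(len(X)), X])

# Build the response exactly from the invented relationship (no noise).
y = A @ true_coefs

# Ordinary least squares: find the coefficients that minimize the
# squared distance between A @ coefs and y.
coefs, *_ = np.linalg.lstsq(A, y, rcond=None)
print("intercept and four slopes:", np.round(coefs, 3))

# Predict for a new year: a leading 1 for the intercept, then the four
# predictor field sizes (placeholder values).
new_year = np.array([1.0, 800.0, 648.0, 500.0, 600.0])
prediction = new_year @ coefs
print(f"fitted value for the new year: {prediction:.0f}")
```

Note that eight observations and five estimated coefficients leave very little room for error, which is the small-sample caveat mentioned earlier; with real, noisy data the slope estimates would be correspondingly uncertain.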
So, my analysis predicts a 2015 WSOP Main Event field size of 6,437. As for my opinion, since I am actively trying to win a seat myself this year, I'll take the over on that.