A Bayesian Approach
The ideas and methods used on this site stem from work by Steven E. Rigdon, Sheldon H. Jacobson, Wendy K. Tam Cho, Edward C. Sewell, and Christopher J. Rigdon, who developed the prediction model and analyzed its results for the 2008 Presidential Election. The relevant citations are:
- Rigdon, S. E., Jacobson, S. H., Cho, W. K. T., Sewell, E. C., & Rigdon, C. J. (2009). A Bayesian Prediction Model for the U.S. Presidential Election. American Politics Research, 37(4), 700-724.
- Rigdon, S. E., Jacobson, S. H., Cho, W. K. T., Sewell, E. C., & Rigdon, C. J. (2010). An Analysis of Daily Predictions for the 2008 United States Presidential Election. Case Studies in Business, Industry, and Government Statistics, 4(1), 1-8.
Background and Motivation
The results of the 2000 and 2004 United States Presidential Elections suggested that it can be difficult to forecast the winner of a presidential election from the popular vote alone. In fact, the popular vote and the Electoral College vote can lead to significantly different results.
To address this, Rigdon et al. created a new forecast model based on the Electoral College vote to determine the winner of the next presidential election. This model was used to track and analyze the 2008 and 2012 presidential elections. Additionally, in 2012 the model was extended to handle Senate races in order to analyze which party would secure control of the U.S. Senate. (See the FAQ for why House forecasts are not included.)
For the 2014 midterm elections, the Election Analytics Team at the University of Illinois has redesigned the website in order to make the modeling process as transparent as possible. New features will be added to the site over the next several months prior to the election, and forecasts will be posted as new polling data is made available.
Technical Analysis and Assumptions
The mathematical model employs Bayesian estimators that use available state poll results (currently drawn from Rasmussen, Survey USA, and Quinnipiac, among others) to determine the probability that each presidential candidate will win each of the states (or the probability that each political party will win the Senate race in each state). These state-by-state probabilities are then used in a dynamic programming algorithm to determine a probability distribution for the number of Electoral College votes each candidate will receive (or Senate seats each party will secure).
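The two steps above can be sketched in miniature. This is not the model from the papers, only an illustration under simplifying assumptions we have chosen: a Beta-Binomial posterior on each state's two-party share (uniform prior by default), and the standard dynamic program that folds states in one at a time to build the Electoral College distribution. Function names and inputs are our own.

```python
import math

def state_win_probability(dem, rep, prior_a=1.0, prior_b=1.0, n=10000):
    """Posterior probability that the Democratic candidate carries a state.

    Treats the dem/rep respondent counts as Binomial draws on the two-party
    share p, with a Beta(prior_a, prior_b) prior.  The posterior is then
    Beta(prior_a + dem, prior_b + rep), and the win probability is
    P(p > 0.5), computed here by midpoint-rule integration of the pdf.
    """
    a, b = prior_a + dem, prior_b + rep
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    h = 0.5 / n
    total = 0.0
    for i in range(n):
        x = 0.5 + (i + 0.5) * h  # midpoint of the i-th subinterval of (0.5, 1)
        total += math.exp(log_norm + (a - 1) * math.log(x)
                          + (b - 1) * math.log(1.0 - x)) * h
    return total

def electoral_vote_distribution(states):
    """states: list of (electoral_votes, win_probability) pairs.

    Returns dist where dist[k] is the probability that the candidate wins
    exactly k Electoral College votes, built by dynamic programming:
    each state either adds its votes (with probability p) or adds nothing.
    """
    total = sum(ev for ev, _ in states)
    dist = [0.0] * (total + 1)
    dist[0] = 1.0
    for ev, p in states:
        new = [0.0] * (total + 1)
        for k, prob in enumerate(dist):
            if prob:
                new[k] += prob * (1.0 - p)   # candidate loses this state
                new[k + ev] += prob * p      # candidate wins its ev votes
        dist = new
    return dist
```

With all 50 states plus D.C. supplied, the candidate's overall win probability would be `sum(dist[270:])`, the mass at or above the 270-vote threshold.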
Polling data for each state is weighted based on how recently the poll was conducted. If three or more polls are available within the past two weeks, then polls within the past week have a weighting factor of 1, polls between one and two weeks old have a weighting factor of 0.5, and polls older than two weeks have a weighting factor of 0. If two or fewer polls are available within the most recent two weeks, then the three most recent polls are used, with polls within the past week having a weighting factor of 1 and polls older than one week having a weighting factor of 0.5. If no polls comparing the two candidates are available for a state, the results of the last election are used to estimate the outcome in the upcoming election.
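The weighting rule above can be written out directly. This is a minimal sketch of the rule as stated; the function name and input format (poll ages in days) are our own.

```python
def poll_weights(ages_in_days):
    """Weight each poll by recency, per the two-case rule described above.

    ages_in_days: age of each available poll, in days.  The returned list
    of weights is aligned with the input order.
    """
    ages = list(ages_in_days)
    if sum(1 for a in ages if a <= 14) >= 3:
        # Enough recent data: weight by age band, dropping anything
        # older than two weeks.
        return [1.0 if a <= 7 else (0.5 if a <= 14 else 0.0) for a in ages]
    # Sparse data: fall back to the three most recent polls only;
    # within that set, under a week old gets 1, older gets 0.5.
    keep = set(sorted(range(len(ages)), key=lambda i: ages[i])[:3])
    return [(1.0 if ages[i] <= 7 else 0.5) if i in keep else 0.0
            for i in range(len(ages))]
```

For example, with polls 3, 10, and 20 days old, only two fall within two weeks, so the three-most-recent fallback applies and the 20-day-old poll still receives weight 0.5.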
Polls often report figures for likely voters rather than registered voters. If actual turnout among registered voters differs from what the likely-voter models assume, the poll numbers may not be representative of the electorate that shows up on election day.
Limitations of Results
The results presented are a direct function of the quality of the state polling data being used. Any biases in this data can lead to misleading results and invalid conclusions. The results of this analysis have been obtained as part of an academic, educational exercise to demonstrate the power of statistics and operations research to analyze data of significant importance and practical interest.