Earlier we learned that Markos predicted the results of the closest 2012 presidential races with more accuracy than Nate Silver at 538, who was the best of the polling aggregators and/or modelers.
But guess what—there's actually a free-range, locally produced, antibiotic-free, (relatively) simple model that, to my everlasting surprise, does better than both of them. It's based on the idea that most of the time, the polls are off by a fairly predictable amount related to the partisan lean of the state. I'll call it DRM (Dreaminonempty's Regression Model), 'cause I gotta call it something.
Returning to Markos' post, here's the table he posted with DRM predictions and polling averages for the final 10 days added in (the closest predictions for each state are highlighted):
And, to expand on the comparison, here's the Senate numbers (poll averages include entire month of October in some cases; see below the fold for details):
The average error of DRM predictions is lower for both close Senate and presidential races. Not only that, DRM predictions came closest to the actual result in 12 races, while Markos grabbed the honors in nine races, and the polling average was best three times. It looks like we can add another successful, reality-based, Daily-Kos-originated prediction model to the collection!
And the best news? It's easy to make a DRM prediction. See below the fold for simple instructions and analysis of additional races.
A simple explanation: (updated from the comments)
Polling averages often don't predict the election margin (%D-%R) correctly, even if they do predict the winner. Usually, they underestimate the Democrat's performance in Blue states, and underestimate the Republican's performance in Red states. This is likely because, as LNK put it, people tend to conform to their surroundings.
I attempted to correct for this in a very simple manner, in hopes that, on average, the new predictions would be more accurate than the polls alone. The correction involved adding a number to the polling margin in each state based on how 'red' or 'blue' it was.
In the end, the corrected polling numbers were more accurate than the polling numbers alone. The method worked.
Instructions for DRM
1. Look at polls for about the previous month (Oct. 1 onward for a final prediction). Are there more than three polls in the past 10 days? If yes, skip to Step 2. If no, average the Democratic margins (%D-%R) for all polls in the month, and go to Step 3.
2. Do you see a trend in the polls over the past month? If you see a trend, average the Democratic margins (%D-%R) for the polls for the previous 10 days only. Otherwise, average the Democratic margins (%D-%R) for all polls in the month.
3. Find your state on this table. Add the DRM factor you see to your polling average for the margin estimate.
4. Check for red flags. If there are only one or two polls total, or a third party candidate is drawing more than 5 percent of the vote, the estimate could be off by a substantial amount. Also, states with a federal/local party mismatch, like West Virginia, could run into trouble.
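For anyone who prefers code to prose, the four steps above can be sketched as a short Python function. This is a rough sketch, not the author's actual implementation: the trend check in Step 2 is a judgment call, so it's passed in as a flag, and the red-flag checks in Step 4 are left to the reader.

```python
from statistics import mean

def drm_predict(poll_margins_month, poll_margins_last10, drm_factor,
                trend=False):
    """Sketch of the DRM steps.

    poll_margins_month: Democratic margins (%D - %R) for all polls
        from roughly the past month.
    poll_margins_last10: margins for polls in the final 10 days.
    drm_factor: the state's correction from the linked table.
    trend: whether the month's polls show a clear trend (a judgment call).
    """
    # Steps 1-2: use only the last 10 days when there are more than
    # three recent polls AND a trend is visible; otherwise average
    # the whole month.
    if len(poll_margins_last10) > 3 and trend:
        avg = mean(poll_margins_last10)
    else:
        avg = mean(poll_margins_month)
    # Step 3: add the state's DRM factor to get the margin estimate.
    return avg + drm_factor
```

For example, a +3.8 polling average combined with a +4.1 DRM factor returns a +7.9 margin estimate.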
Example: In my first post on this subject, I used Elizabeth Warren's Senate race in Massachusetts as an example, so let's return to that race.
The polls for this race from Oct. 1 onward are pretty steady. There are 17 polls, with an average margin of +3.8 points in Warren's favor.
The state table shows Massachusetts with a DRM factor of +4.1 points, yielding an estimated margin of 7.9 points in Warren's favor. The actual result was Warren +7.4 points.
How does the DRM work?
The basic idea is that in red states, Republicans do a little better than the polls say they will, while in blue states, Democrats tend to do a little better than the polls say they will. You can see the relationship here:
The regression line is used to generate the DRM factors in the linked table. This regression will be updated after 2012 results are finalized.
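To make the mechanics concrete, here's a minimal sketch of how such a regression could be run in pure Python. The partisan-lean and poll-error numbers below are made up for illustration; they are not the actual data behind the linked table.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data: one point per past race.
lean = [-20.0, -10.0, 0.0, 10.0, 20.0]    # state partisan lean (D margin)
poll_error = [-2.5, -1.0, 0.2, 1.1, 2.4]  # actual margin minus poll average

slope, intercept = fit_line(lean, poll_error)

def drm_factor(state_lean):
    """The DRM factor for a state is the poll error the line predicts."""
    return slope * state_lean + intercept
```

The key feature is the positive slope: bluer states get a positive correction and redder states a negative one, exactly the pattern described above.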
How well does this method work?
In my final pre-election post, I said I would consider the model a success if the average error is lower for the predictions than for polling averages alone. As shown above, the DRM model was indeed successful by this measure in close presidential and Senate races. What about the rest?
Below I link to all the predictions and errors, including 538 predictions for comparison against the best of the well-known models. Please note that counting is not complete in many states, so some of these numbers could change. Also, I had a second set of predictions, based not on the regression but on the prior performance of polls in each individual state, in either presidential races or Senate and Governor races. This I will call DSE (Dreaminonempty's State Errors).
Here are links to predictions and errors for the three sets of races:
President - Predictions and Errors
Senate - Predictions and Errors
Governor - Predictions and Errors
Note that some changes were made after originally posting these predictions as data entry errors were found and corrected.
Generally speaking, the worst DRM predictions were for states with few polls, and some of the races with third party candidates drawing more than 5 percent of the vote.
The first way to test the predictions is to ask which prediction was best most often. By this measure, DRM was by far the best, with the closest prediction more than half the time:
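A quick sketch of that tally, assuming you have a table of absolute errors per race for each method (the method names and numbers here are hypothetical; ties go to whichever method is listed first):

```python
def count_wins(abs_errors):
    """abs_errors maps method name -> list of absolute errors,
    one per race, with races in the same order for every method.
    Returns how often each method had the smallest error."""
    wins = {method: 0 for method in abs_errors}
    n_races = len(next(iter(abs_errors.values())))
    for i in range(n_races):
        # min() breaks ties in favor of the first method listed.
        best = min(abs_errors, key=lambda m: abs_errors[m][i])
        wins[best] += 1
    return wins
```

For instance, `count_wins({"DRM": [0.5, 2.0], "polls": [1.0, 1.0]})` credits DRM with the first race and the polling average with the second.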
Another, geekier, way to look at errors is the Root Mean Square Error (RMSE). It's basically a measure of accuracy—and all you need to know is that lower numbers are better.
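For the curious, RMSE is straightforward to compute; here's a sketch:

```python
from math import sqrt

def rmse(predictions, actuals):
    """Root Mean Square Error: square each miss, average the squares,
    take the square root. Bigger misses are punished more heavily."""
    misses = [(p - a) ** 2 for p, a in zip(predictions, actuals)]
    return sqrt(sum(misses) / len(misses))
```

A single prediction of +7.9 against an actual result of +7.4 gives an RMSE of 0.5 points; across many races, the method with the lower RMSE missed by less on average.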
DRM performs best in presidential and Senate races, but worse than the polling average in gubernatorial races.
Out of curiosity, I redid the above table, but only included states with 10 or more polls, and excluded states with third parties >5 percent. The accuracy of these predictions was clearly better, to no great surprise. The relative performance of the different prediction methods remained about the same.
Indeed, the errors of all methods shrink as the number of polls increases. Here's an example of presidential numbers for DRM:
Questions to address
With all the new 2012 data, I hope to look into the following issues:
Can we do anything to sort out races with third parties?
Is there a reason why sometimes DSE is better than DRM?
Is there a better measure for state partisanship than Obama's 2008 vote share?
Is there an issue with using this method for gubernatorial races?
Hopefully these questions will lead to improvement without sacrificing simplicity.
Final thoughts: (updated from the comments)
This method of prediction is nowhere close to the same level of sophistication as 538. The intent was to find something as simple as possible that was better than the polls alone. This is about as simple as you can get - it's only got two inputs. Keeping it simple makes it easy for many people to replicate on their own and put to their own uses.