Alright, so one of my pet peeves about modern campaigns - and especially modern media coverage of those campaigns - is the relentless obsession over the horserace numbers in polling. Rarely do individual numbers matter so much as the trend, especially if that trend is long term and/or noticed across polling firms.
Yet, idiotic "professional" pundits always seem to get how polling works completely wrong. Take Stu Rothenberg's latest assessment, for example.
His conclusion:
Everyone seems to conduct polls these days, but not everyone conducts good ones. That’s the message I drew after reviewing dozens of presidential polls conducted from Iowa through Indiana and North Carolina.
And here's what he had to say about the "worst" pollster:
The worst-performing poll has been Suffolk.
Suffolk University’s pre-primary survey correctly predicted the winner in only five of nine contests. It was wrong in both New Hampshire primaries, the California Democratic primary and, incredibly, the Democratic primary in Massachusetts, the state where the university is located.
The other five contests in which Suffolk polled, the results were quite good, within a couple of points of the actual results. But in polling, being right about half the time isn’t a record to be proud of.
So what is he saying here? That to be a good pollster you need to not just predict the winner but be within a few points of the actual result? How idiotic is that?
Well, you might not know all the reasons why I think that is a stupid, polling-illiterate way of assessing the utility of political surveys. So why don't I walk through the ones I can think of?
Every poll has a margin of error, and it's bigger than usually reported. When reporters say a poll has a 3-point margin of error and a candidate is ahead by 4 points, that doesn't mean he's safe - it just means he's more likely to be ahead than behind; it doesn't rule out the possibility that he's actually trailing. Margins work in both directions for both candidates. So a 50-46 survey result with a MoE of +/- 3% means the pollster believes Candidate A is somewhere between 47 and 53 and Candidate B is somewhere between 43 and 49. The extreme ends show the pollster believes Candidate A could be anywhere from ahead 53-43 to behind 47-49. That's a huge range of possible outcomes - so no reporter worth their salary should be claiming the candidate is "outside the margin of error" in that case.
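The arithmetic above is simple enough to sketch in a few lines of code. This uses the same hypothetical 50-46 poll with a +/- 3 margin of error; the function name is just for illustration.

```python
def support_range(share, moe):
    """Range of plausible support for one candidate, given the margin of error."""
    return (share - moe, share + moe)

# Hypothetical 50-46 result with MoE of +/- 3 points.
a_low, a_high = support_range(50, 3)   # Candidate A: 47 to 53
b_low, b_high = support_range(46, 3)   # Candidate B: 43 to 49

# Best case for A: A at the top of their range, B at the bottom.
best_lead = a_high - b_low    # +10
# Worst case for A: A at the bottom of their range, B at the top.
worst_lead = a_low - b_high   # -2

print(f"Candidate A's lead could plausibly be anywhere from {worst_lead} to {best_lead}")
```

A reported 4-point lead hides a 12-point-wide band of plausible outcomes, which is exactly why "outside the margin of error" claims need the margins applied to both candidates.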
Even then, some polls are outliers. One rarely reported aspect of statistical surveys is that beyond the margin of error, there is a confidence level - typically, the surveyor believes the result will fall within the margin of error 95% of the time. That means if 20 pollsters polled this race at the same moment, odds are at least one of them would produce a result outside the margin of error. It just happens. Not even random samples can avoid outlier results.
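A toy simulation makes the "1 in 20" point concrete: even perfectly conducted polls of a race land outside the margin of error about 5% of the time, by pure sampling luck. The sample size and true level of support here are assumptions picked for illustration.

```python
import random

random.seed(42)

n = 1000          # respondents per poll (assumed)
p_true = 0.50     # true support for the candidate (assumed)
# Standard 95% margin of error for a proportion: ~3.1 points at n=1000.
moe = 1.96 * (p_true * (1 - p_true) / n) ** 0.5

def one_poll():
    """Simulate one honest poll: n random respondents, return measured support."""
    hits = sum(1 for _ in range(n) if random.random() < p_true)
    return hits / n

trials = 5000
outside = sum(1 for _ in range(trials) if abs(one_poll() - p_true) > moe)
print(f"Honest polls that still missed the MoE: {outside / trials:.1%}")  # roughly 5%
```

No methodology error is needed to produce an outlier; the confidence level guarantees some will occur whenever enough polls are taken.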
Undecideds, when they do end up voting, rarely break the same way as the rest of the electorate. We saw this with Clinton-Obama: most undecided voters ultimately opted for the "safe" choice, Hillary, even as most other primary voters were persuaded to side with Barack. And oftentimes, well-known incumbents who are having trouble "closing the deal" will see their margins shrink on Election Day as the undecideds side with the challenger. This differential between the decideds and the undecideds will skew your result. Some pollsters try to minimize it by artificially forcing respondents to choose who they "lean" toward, but that adds another unpredictable variable to your survey (see below).
Wilder Effect: Not everyone will vote for the person they claim they will. This year, most pundits who mention this do so in the context of the "Wilder Effect" - that many whites will simply not vote for a black candidate, but are too embarrassed to say so in public. But this phenomenon can happen in any electoral survey, for any reason, including as simple a thing as not wanting to let your family, who can hear you on the phone, know you're voting against the rest of them.
Turnout trumps all. On Election Day, turnout can vary widely. College students could head to the beach and forget to vote; seniors could have a bad day and miss their ride to the polls; the boss could make you work late, etc. In the aggregate, this may not matter a whole lot. But it does matter. Many Republicans will try to depress African American turnout, or confuse it, to dilute its impact on the results. Democrats, likewise, might do everything they can to make sure seniors vote, inflating their proportional turnout. No pollster can perfectly gauge turnout among demographic and geographic groups until Election Day - they can try, but weighting results according to what you predict might happen adds yet another factor that can skew results.
Polling is not a tool for making predictions. Treating any poll conducted on a day other than Election Day as a prediction is a fool's errand, and even exit polls are notoriously off the mark (it all depends on who is willing to stand outside and answer questions). You shouldn't take a horse-race number and expect the results to match, especially when the survey is conducted several days before Election Day and (as noted above) could already be far off the mark.
If we wanted to make predictions using polling, perhaps instead of the horserace numbers, we should use a probability factor (combining the horserace with the margin of error). Using our example, the range of expected results runs from +10 to -2 for Candidate A; ten of those twelve points have him ahead, so we might say Candidate A has an 83% probability of being in the lead as of today.
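That 83% can be reproduced with a few lines of code. Note this is the back-of-envelope version - it treats every lead in the -2 to +10 range as equally likely, which is an assumption, not a proper statistical model; the function name is invented for illustration.

```python
def prob_leading(worst_lead, best_lead):
    """Fraction of the possible lead range that is above zero,
    assuming every lead in the range is equally likely."""
    if worst_lead >= 0:
        return 1.0   # even the worst case has the candidate ahead
    if best_lead <= 0:
        return 0.0   # even the best case has the candidate behind
    return best_lead / (best_lead - worst_lead)

# Our running example: Candidate A's lead could be anywhere from -2 to +10.
p = prob_leading(-2, 10)
print(f"Chance Candidate A is actually ahead: {p:.0%}")  # 83%
```

A real model would weight the middle of the range more heavily than the extremes, but even this crude version communicates uncertainty far better than a bare "+4" headline.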
But that might not be nearly as exciting (or as good for ratings) as saying he is barely ahead by 4 points.
And given these limitations, polling is not necessarily that good for the topline numbers. But it can be tremendously useful when examining crosstabs, and for getting a general sense of the mindset of the voters (by asking other non-electoral questions as well). And if the same pollster does 3 more polls and we see Candidate B slowly gaining ground and maybe overtaking Candidate A, not only can we look at the probability that he's ahead, but now we have a suggestion of a trend in Candidate B's favor.
And trends/momentum are much more indicative of how well Election Day may go (see Barack Obama and the Indiana primary) than any single polling result.
Sorry, Stu, I know you get paid for your opinions, but you just don't know jack about polling, and I'd appreciate it if you stopped bad-mouthing the polling profession.