In the last post I ended with the phrase 'Beware the crosstabs.' Today I will expand on that.
Here's what happens if you break down the aggregated Daily Kos/SEIU/PPP polling to date by race and age. (For racial groups that are not shown, N is too small for the younger generations.)
Let me be VERY clear here, before I begin discussion, that some of these numbers are not correct.
The first thing you might notice is that the 18-29 age group appears less supportive of Obama among Hispanics and African-Americans than older demographics, although among whites it is more supportive. At first blush, this is concerning. Republicans have said that as minorities move into the middle class they will start to vote more conservatively... could they actually have been right? Are these pampered young middle class kids rebelling against their parents? Or is this a problem with missing cell phones (PPP does not call cell phones)? Is it that Young Republican minorities are much more likely to have a landline? Or something else?
We'll focus on African-Americans because the numbers are more extreme and therefore easier to work with. First, we will eliminate the possibility that these strange numbers are actually correct by simply looking at the generic ballot numbers instead and comparing them to the exit polls for 2010 and 2008:
It takes all of about two seconds to see that the aggregated Daily Kos polling numbers for younger African-Americans are, shall we say, slightly misleading. There is simply no plausible way that Democrats are currently underperforming 2010 results among some of the most Democratic demographic groups in the country.
But the error bars are so small! How could the numbers be so far off? What we see here is a great example of polling error that is not included in the mathematical margin of error calculation. Follow me below to see what this error is.
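For readers who want to see why the error bars look so small, here is a minimal Python sketch of the textbook 95% margin of error for a proportion. It assumes simple random sampling, and the 80% approval figure and sample sizes are made up for illustration; they are not taken from the Daily Kos/SEIU/PPP data. The function name is my own.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p measured on n respondents.

    Assumes simple random sampling, so it reflects random sampling error only.
    """
    return z * math.sqrt(p * (1.0 - p) / n)

# Hypothetical values for illustration only (not from the polling data).
for n in (100, 1000, 10000):
    print(n, "respondents: +/-", round(100 * margin_of_error(0.80, n), 1), "points")
```

The only inputs are the measured proportion and the sample size, so aggregating more interviews always shrinks the error bar. Nothing in the formula knows whether respondents answered the demographic questions correctly.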
To delve into this issue further, I split the African-American approval numbers by region as well as age. I will use the regions I defined (some better than others) in a 2011 post. For this discussion, it doesn't really matter what they are, except that several of them were drawn with the purpose of including high concentrations of African-American voters.
Below is a graph of African-American approval of Obama by age as a function of racial demographics, with each data point representing an age group in a different region:
Yesterday we saw data consistent with a fair number of respondents entering the incorrect race, many of them (likely) on purpose. Here, when we break down numbers that we know are incorrect from comparing to exit polls, we see that the 'incorrectness' of the numbers increases as the ratio of white to Black respondents in a region increases. This is also consistent with a near-constant proportion of respondents incorrectly answering the race question.
In theory, the points on this graph should all fall in a straight line, and the slope of the line should tell us what percent of whites in each age group are reporting themselves to be Black. (This assumes uniform opinions among African-Americans nationwide, and that all respondents are telling the truth about their age, and a few other things as well.) The graph shows a slope that is twice as steep for the young as the old. This leads us to another conclusion: either young people are twice as likely to fib about their race as old people, or one of our assumptions was wrong. And, indeed, we already know that people are about twice as likely to press 1 as 3 when they press the wrong number for geography; it is not a stretch to believe the same thing would happen with the age question. This isn't the only possible explanation, but together with the evidence we've seen on the race and geography questions, I would personally put this explanation into the category of 'likely.' Any way you look at it, however, the numbers for young people will be incorrect.
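The straight-line reasoning can be sketched in Python as a simple mixing model. Everything here is an assumption for illustration: the approval rates, the misreport rates, the white-to-Black ratios, and the function names are hypothetical and are not the values behind the graph. The idea is that respondents recorded as Black are a blend of true Black respondents and a small share of white respondents, so measured approval drifts toward the white number as the ratio grows.

```python
def observed_approval(white_to_black_ratio, misreport_rate, approve_black, approve_white):
    """Approval measured among respondents recorded as Black.

    white_to_black_ratio: white respondents per Black respondent in a region.
    misreport_rate: assumed share of white respondents recorded as Black.
    """
    # Misrecorded white respondents per true Black respondent.
    contaminants = misreport_rate * white_to_black_ratio
    return (approve_black + contaminants * approve_white) / (1.0 + contaminants)

def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical inputs, chosen only to show the shape of the effect.
APPROVE_BLACK = 0.95
APPROVE_WHITE = 0.35
ratios = [0.5, 1.0, 2.0, 4.0, 8.0]          # one value per made-up region
assumed_rates = {"older": 0.01, "younger": 0.02}

slopes = {}
for group, rate in assumed_rates.items():
    approvals = [observed_approval(r, rate, APPROVE_BLACK, APPROVE_WHITE) for r in ratios]
    slopes[group] = fit_slope(ratios, approvals)
    # For small contamination, slope is roughly -rate * (approve_black - approve_white).
    implied_rate = -slopes[group] / (APPROVE_BLACK - APPROVE_WHITE)
    print(group, "slope:", round(slopes[group], 4), "implied misreport rate:", round(implied_rate, 4))
```

Under these assumptions, doubling the misreport rate roughly doubles the slope, which is the pattern described above. The line is only approximately straight, so the rate read off the slope comes out a little below the rate that was put in.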
So we can basically conclude that a small proportion of people may be incorrectly entering their age in PPP polls, just as we saw with race and geography, messing up numbers for young people and minorities. This is likely true for all automated polling and probably to some extent live-calling polling as well, depending on the poll design.
Again, as I did yesterday, I would like to remind the reader that PPP still has accurate topline numbers, and we wouldn't be able to delve this deeply into the numbers to explain the funky results we see if it weren't for PPP's transparent release of their raw data.
___________
Beyond the Margin of Error is a series exploring problems in polling other than random error, which is the only type of error the margin of error deals with.
Previously:
This Is Why We Can't Have Nice Things. A small number of respondents press the wrong button when answering the Daily Kos poll question on race, leading to inaccurate numbers for racial minorities in the crosstabs.
Why Don't People Know Where They Live in the DKos Poll? A small number of respondents - around 5-9% - press the wrong button when answering the geography question on the Daily Kos poll. This is far greater than can be explained by observed rates of misunderstandings or data entry errors.
Why State Polls Look More Favorable For Obama than National Polls. In the spring and summer, lack of support in Blue States was bringing down Obama's performance in national polls, while Swing States and Red States were polling about the same as 2008.
Presidential Polls Are Almost Always Right, Even When They're Wrong. How the presidential polls in red and blue states are off, sometimes way off, and how to predict how far off they'll be.
When Polls Fail, or Why Elizabeth Warren Will Dash GOP Hopes. Why polls for close races for Governor and Senate are sometimes way off, and how to predict how far off they will be.