How did we do? A look back at Daily Kos Elections' 2018 race ratings

by David Jarman for Daily Kos Elections

Monday, Dec. 31, 2018 at 11:49:14am PST

Elections are never truly over; pundits start talking about the next one as soon as the last one is in the books, before the lame duck session has even ended, and that’s especially the case this year, in that we’re most likely headed for a full re-do of the fraud-tainted election in North Carolina’s 9th congressional district. But, to the extent that the votes are finally counted in the vote-by-mail western states, we’re done enough that we can take a quick retrospective look at how Daily Kos Elections’ race ratings performed this year.

As you might know, we had two entirely different approaches to rating the races, which provide two different perspectives. There are our qualitative ratings (Senate, gubernatorial, and House), which use the Tossup/Lean/Likely/Safe formulation that a number of other prognosticators use; these are based on polling data, of course, but also a gestalt mixture of other factors, like the state of the overall national environment; candidate fundraising; what races the major outside groups like the DCCC or House Majority PAC are spending on; what rumors about the candidates’ chances are getting leaked to the press; and other intangibles, such as whether a candidate’s messaging or ads sound confident or defensive.

There is also our quantitative system, which uses Bayesian trendlines to average polls in the races. We chose not to build a full-on predictive model this year, one that would assign specific probabilities to each race or to a cumulative event like flipping control of the Senate (which means that, unlike in 2014, we can’t say “Wow, we had the best ‘Brier score’ of any model”). By comparing our quantitative averages to the actual results, however, we have the chance to delve a little more deeply into how accurate both our quantitative and our qualitative ratings were. (Also, even among various analysts, such as FiveThirtyEight or RealClearPolitics, you’ll see slightly different averages, so that can be a basis for comparison too. Averages can vary depending on which pollsters get included, whether bad pollsters [or internal polls] get downweighted, and the rate at which older polls’ influence decays.)
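To make those three knobs concrete, here is a minimal sketch of a weighted poll average. It is not our Bayesian trendline method; it is a much simpler stand-in, and the `Poll` class, the `quality` weight, the 14-day half-life, and the polls themselves are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Poll:
    dem: float            # Democratic share, in percent
    rep: float            # Republican share, in percent
    days_old: int         # age of the poll in days
    quality: float = 1.0  # hypothetical weight: 1.0 = trusted, below 1.0 = downweighted


def weighted_margin(polls, half_life_days=14.0):
    """Average D-minus-R margin, weighting each poll by quality and recency."""
    weighted_sum = 0.0
    total_weight = 0.0
    for poll in polls:
        # A poll's influence halves every `half_life_days` days.
        recency = 0.5 ** (poll.days_old / half_life_days)
        weight = poll.quality * recency
        weighted_sum += weight * (poll.dem - poll.rep)
        total_weight += weight
    if total_weight == 0:
        raise ValueError("no polls with nonzero weight")
    return weighted_sum / total_weight


# Invented polls, for illustration only.
polls = [
    Poll(dem=48, rep=44, days_old=0),
    Poll(dem=45, rep=47, days_old=14),
    Poll(dem=50, rep=42, days_old=28, quality=0.5),  # e.g. a leaked internal
]

print(round(weighted_margin(polls), 2))                       # fast decay
print(round(weighted_margin(polls, half_life_days=1e9), 2))   # almost no decay
```

Dropping a pollster from the list, changing its `quality`, or changing `half_life_days` each moves the average a little, which is why two outlets can publish slightly different numbers from the same set of polls.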

Part of the appeal of doing qualitative ratings is that it’s hard to be truly “wrong”; you can, however, strive to have as few races as possible in the Tossup category, which basically means that either candidate has a good chance of winning, but many races just simply belong there. The main way to be truly “wrong” in the qualitative scheme would be to say that one party is going to win a particular race (for instance, by putting it in the Lean Democratic or Likely Democratic column) and then having it end up with the other party winning it. (Theoretically, you could have some egg on your face by putting a race in the Tossup column that one party wound up winning by, say, 20 points; overusing Tossup to avoid making hard choices kind of defeats the whole purpose of doing qualitative ratings, though.)

For instance, in our Senate ratings, we didn’t have any Lean Republican races that were won by a Democrat (that column was ND-Sen, TN-Sen, and TX-Sen, all of which were GOP wins), and we didn’t have any Lean Democratic races that were won by a Republican (that column was limited to NJ-Sen and WV-Sen). The six races in the Tossup column were split, with three going to the Democrats (Arizona, Montana, and Nevada) and three going to the Republicans (Florida, Indiana, and Missouri). So, by our standards, we didn’t get anything “wrong” in the Senate.

And we got a similarly positive result in our governors ratings; all the Lean Democratic races were Democratic wins (CO-Gov, CT-Gov, NM-Gov, OR-Gov, and RI-Gov) and all the Lean Republican races were Republican wins (AK-Gov, OK-Gov, and SD-Gov). And again we split the Tossups down the middle: The Democrats won four (KS-Gov, ME-Gov, NV-Gov, and WI-Gov), while the GOP won four (FL-Gov, GA-Gov, IA-Gov, and OH-Gov).

In our House ratings, there were a total of four races that we got “wrong”; we’re of course pleased with that, since there are way more House races (we rated nearly 100 of them as competitive, so four misfires is, historically, pretty good), most of which see little polling information when compared with statewide races. A few misses like that are inevitable, especially in a wave election where a few sleeper races always appear at the very end. (Also, we’re pleased with it because the unexpected results were all Democratic wins.)

The Democratic candidates picked up two races that we rated Lean Republican, NY-11 and SC-01, and two that we rated Likely Republican, CA-21 and OK-05. The decision to have the 21st as Likely may seem odd on paper, as, coming into November, it was one of the bluest (at the presidential level) House seats still held by a Republican, but it’s one that’s been a constant source of disappointment all decade. Polling had shown Rep. David Valadao easily winning against Democratic challenger T.J. Cox, who also wasn’t posting the same gangbusters fundraising numbers as other Democrats.

So all the evidence was there that, once again, it was fool’s gold—but this time, somehow, there was adequate Latino turnout in this Central Valley seat to get Cox over the top. (If you’re wondering when the last time was that we whiffed on a race that was in the Safe category, it hasn’t happened since 2008, when we were still Swing State Project; we didn’t see Bill Jefferson’s loss to Joe Cao coming in dark-blue LA-02 in New Orleans, which required a perfect storm of a corruption scandal, a December runoff, and a literal storm, in the form of Hurricane Gustav.)

Now let’s take a more in-depth look at the quantitative side, starting with the Senate:

STATE	PREDICTED	MARGIN	ACTUAL	MARGIN	OOPS FACTOR	QUAL. RATING
MINNESOTA (SP.)	49-40	+9	53-42	+11	+2	Likely D
NEW JERSEY	51-39	+12	54-43	+11	-1	Lean D
WISCONSIN	52-42	+10	55-45	+10	0	Likely D
MICHIGAN	51-44	+7	52-46	+6	-1	Safe D
OHIO	50-39	+11	53-47	+6	-5	Likely D
NEVADA	46-45	+1	50-45	+5	+4	Tossup
WEST VIRGINIA	46-41	+5	50-46	+4	-1	Lean D
MONTANA	47-46	+1	50-47	+3	+2	Tossup
ARIZONA	46-47	-1	50-48	+2	+3	Tossup
FLORIDA	49-46	+3	50-50	-0.12	-3	Tossup
TEXAS	47-48	-1	48-51	-3	-2	Lean R
MISSOURI	46-47	-1	46-51	-5	-4	Tossup
INDIANA	44-42	+2	45-51	-6	-8	Tossup
MISSISSIPPI (SP.)	36-49	-13	46-54	-8	+5	Likely R
NORTH DAKOTA	41-51	-10	44-55	-11	-1	Lean R
TENNESSEE	44-49	-5	44-55	-11	-6	Lean R

As you can see, the Senate averages performed pretty well: Only three races were “wrong” in terms of the predicted topline. In other words, the averages pointed to Republicans winning in Arizona and Democrats winning in Florida and Indiana, which didn’t happen; on the other hand, the error (what I’m calling the “oops factor” in the second-to-last column: the difference between the predicted margin and the actual margin) on Arizona and Florida was pretty small. The error on Indiana, however, was the largest of any Senate race we looked at. (Of course, our averages are only as good as the polls that we put into them.)
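For anyone who wants to check the arithmetic, here is a small sketch that recomputes the “oops factor” from the Senate table. The margins are copied from the table above (Democrat minus Republican, in points); the function names are just illustrative labels, not part of any tool we use.

```python
# (state, predicted margin, actual margin), all D minus R, from the Senate table.
senate = [
    ("Minnesota (sp.)", 9, 11),
    ("New Jersey", 12, 11),
    ("Wisconsin", 10, 10),
    ("Michigan", 7, 6),
    ("Ohio", 11, 6),
    ("Nevada", 1, 5),
    ("West Virginia", 5, 4),
    ("Montana", 1, 3),
    ("Arizona", -1, 2),
    ("Florida", 3, -0.12),
    ("Texas", -1, -3),
    ("Missouri", -1, -5),
    ("Indiana", 2, -6),
    ("Mississippi (sp.)", -13, -8),
    ("North Dakota", -10, -11),
    ("Tennessee", -5, -11),
]


def oops_factor(predicted, actual):
    """Positive means the Democrat beat the polling average; negative, the Republican."""
    return actual - predicted


def topline_misses(rows):
    """States where the average pointed to one party and the other one won."""
    return [state for state, predicted, actual in rows if predicted * actual < 0]


def mean_absolute_error(rows):
    """Average size of the miss, ignoring which party it favored."""
    return sum(abs(oops_factor(p, a)) for _, p, a in rows) / len(rows)


def biggest_miss(rows):
    """The single state with the largest oops factor in absolute terms."""
    return max(rows, key=lambda row: abs(oops_factor(row[1], row[2])))[0]


print(topline_misses(senate))                  # ['Arizona', 'Florida', 'Indiana']
print(round(mean_absolute_error(senate), 2))   # about 3 points
print(biggest_miss(senate))                    # Indiana
```

Because the table rounds Florida’s oops factor to -3 while this sketch uses the unrounded -0.12 actual margin, the average here differs from a hand tally of the column by a hair.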

Several other races saw a notable error, though none of those errors affected the topline result. Mike Espy (who lost in Mississippi) and Jacky Rosen (who won in Nevada) significantly overperformed their polls, while Sherrod Brown (who won in Ohio) and Phil Bredesen (who lost in Tennessee) underperformed theirs. There was also one other race where the polling was very accurate, but we may have misclassified the race in our qualitative scheme: Michigan, where Republican challenger John James closed the gap late in the race, which the last few polls reflected, but we didn’t move the race to, say, Likely Democratic instead of Safe Democratic. (Debbie Stabenow still won, of course, so it’s not a big deal.)

Now let’s do the same analysis on the gubernatorial side:

STATE	PREDICTED	MARGIN	ACTUAL	MARGIN	OOPS FACTOR	QUAL. RATING
RHODE ISLAND	44-33	+11	53-37	+16	+5	Lean D
ILLINOIS	48-32	+16	54-39	+15	-1	Likely D
NEW MEXICO	52-42	+10	57-43	+14	+4	Lean D
MINNESOTA	50-40	+10	54-42	+12	+2	Likely D
COLORADO	48-40	+8	53-43	+10	+2	Lean D
MICHIGAN	49-41	+8	53-44	+9	+1	Likely D
MAINE	51-38	+13	51-43	+8	-5	Tossup
OREGON	45-41	+4	50-43	+7	+3	Lean D
KANSAS	42-41	+1	48-43	+5	+4	Tossup
NEVADA	45-45	0	49-45	+4	+4	Tossup
CONNECTICUT	44-39	+5	49-46	+3	-2	Lean D
WISCONSIN	48-46	+2	50-48	+2	0	Tossup
FLORIDA	49-45	+4	49-50	-1	-5	Tossup
GEORGIA	44-50	-6	49-50	-1	+5	Tossup
IOWA	46-44	+2	48-50	-2	-4	Tossup
SOUTH DAKOTA	48-45	+3	48-50	-2	-5	Lean R
OHIO	45-42	+3	47-50	-3	-6	Tossup
ALASKA	44-47	-3	44-51	-7	-4	Lean R
NEW HAMPSHIRE	45-46	-1	46-53	-7	-6	Lean R
SOUTH CAROLINA	36-51	-15	46-54	-8	+7	Safe R
MARYLAND	35-51	-16	44-55	-11	+5	Likely R
OKLAHOMA	43-47	-4	42-54	-12	-8	Lean R
TEXAS	39-53	-14	43-56	-13	+1	Safe R
ARIZONA	38-55	-17	42-56	-14	+3	Safe R
VERMONT	37-48	-11	40-55	-15	-4	Likely R

The errors are a little larger in the gubernatorial races, which may be expected, since many of these races saw only a handful of polls (the principle is that more polls from more pollsters lower the cumulative margin of error). Oddly, though, one of the races with a fairly large “oops factor” was the Florida gubernatorial race, probably the most heavily polled race in the country. Florida was also one of only four races where the error affected the topline result: Our averages would have pointed to the Democratic candidates winning in Florida, Iowa, Ohio, and South Dakota, which didn’t happen. (The Democrats still flipped a total of seven gubernatorial seats, which is a historically large number.)
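The “more polls, less error” principle can be sketched with the textbook margin-of-error formula. This is a simplification under loud assumptions: it treats every poll as an independent simple random sample, ignores pollster house effects and systematic bias (the likely culprit in Florida), and measures the error on one candidate’s share rather than on the margin between two candidates. The sample sizes are invented.

```python
import math


def margin_of_error(sample_size, share=0.5, z=1.96):
    """95% margin of error, in points, on a single candidate's share."""
    return 100 * z * math.sqrt(share * (1 - share) / sample_size)


def pooled_margin_of_error(sample_sizes, share=0.5, z=1.96):
    """Treat several polls as one big sample (assumes they are independent)."""
    return margin_of_error(sum(sample_sizes), share, z)


one_poll = margin_of_error(600)
four_polls = pooled_margin_of_error([600, 600, 600, 600])

print(round(one_poll, 1))    # about 4 points
print(round(four_polls, 1))  # about 2 points
```

Quadrupling the total sample only halves the sampling error, and no number of polls fixes an error that every pollster shares.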

There were three Safe Republican races where the final result was within the bounds set by the “competitive” race with the largest margin, which was Vermont: Arizona, South Carolina, and Texas. (Two of those, Arizona and South Carolina, were in the Lean or Likely columns for most of the cycle; they eventually fell off at the end, after polling suggested those races looked pretty hopeless. South Carolina’s race, hardly on anyone’s radar, wound up being within single digits in the end.)

Somewhat amazingly, our polling averages wound up being exactly right in both the Senate and the gubernatorial races in Wisconsin; Tammy Baldwin and Tony Evers wound up winning by 10 and 2, respectively, just like we said. Of course, we can’t take much credit for that; it has mostly to do with Wisconsin’s pollsters (especially Charles Franklin of the Marquette Law poll), who definitely got back on the right track in 2018 after the notorious whiff on Wisconsin polling in 2016. You’ll also notice that Michigan polling was spot-on this year, though Ohio polls continued to underestimate Republican strength there.

Finally, there’s the matter of our House predictions. There are way too many competitive House races to present in tabular form, but if you look back at our final pre-election post, our quantitative averages pointed to a Democratic net gain of 30 seats. That’s going purely race-by-race, based on the topline results from each race’s poll averages, which, as I mentioned before, isn’t very helpful, because many of those races had only one or two polls total over the cycle, many of which were leaked internals rather than nonpartisan media polls.

That poll-averaging method, more so than in the Senate or gubernatorial races, tends to miss late-breaking races (like the handful of Lean R or Likely R races that Democrats won by sneaking across the line at the end). So it shouldn’t be too surprising that the Democrats won quite a bit more than that, with a net gain currently standing at 40 seats.