How 538's 2024 House election forecast works
Here's everything that goes into this year's model.
On Tuesday, 538 released its 2024 election forecast for the House of Representatives. The general idea behind our forecast is to combine polling data (say, on which party Americans want to control Congress) with a bunch of other quantitative and qualitative data to figure out which candidates are likely to win in each congressional district. Then we calculate how much uncertainty we have about the outcome, how that uncertainty is shared across seats, states, regions and the country as a whole, and produce a forecast of overall chamber probabilities for each party. This article explains all the different steps we take to come up with that final probability.
1. The polls
In modern elections, how people vote for Congress has become increasingly correlated with how they vote for president — to the point where the most important thing to get right is the national environment. National polls of the generic congressional ballot — i.e., polls that ask people which party they intend to support for Congress without asking about specific candidates — are therefore a very important input to our forecasting model.
Accordingly, our House forecast starts by calculating an average of generic ballot polls. To do so, we take all qualifying polls, reweight them according to the pollster rating of the responsible firm, adjust results for the way a poll was conducted, and then calculate the likeliest trendline running through the polls. The generic ballot polling average we use for this forecast is actually slightly different from the one we publish on our polls page, which calculates support for each party by combining the results of a weighted average and polynomial regression run on the polling data, with additional weights for poll recency and sample size. The forecast's average is similar to the one we use to average presidential general election polls.
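As a simplified illustration of fitting a trendline through polls, here's a minimal Python sketch. The poll numbers are made up, and this leaves out the pollster-rating weights and methodological adjustments just described:

```python
import numpy as np

# Hypothetical generic ballot polls: days before Election Day and the
# Democratic margin, in points. All numbers here are made up.
days_out = np.array([90, 75, 60, 45, 30, 20, 10, 5])
dem_margin = np.array([1.0, 0.5, 1.5, 0.8, 1.2, 2.0, 1.7, 1.9])

# Fit a quadratic trendline through the polls -- a bare-bones stand-in for
# the weighted average plus polynomial regression described above.
trend = np.poly1d(np.polyfit(days_out, dem_margin, deg=2))

# The estimate of the current national environment is the trendline at day 0.
print(f"Estimated generic ballot margin today: {trend(0):+.1f} (positive = D)")
```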
In addition to the generic ballot polling average, our forecast also incorporates polls of each House race, where available. In most districts, we do not have enough polling data to run a fancier aggregation model of polls over time (like we do for the presidency or Senate), so instead we calculate a simple weighted average of all the polls released for the district. This average gives more weight to polls conducted more recently, more weight to polls conducted by higher-rated pollsters, and less weight to polls from partisan firms or sponsored by partisan clients. We also adjust partisan polls by subtracting 2 percentage points from the affiliated party's share and adding them to its opponent's.
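Here's a minimal sketch of that kind of weighted average. The half-life and pollster weights below are illustrative placeholders, not the model's actual parameters:

```python
def district_poll_average(polls, halflife_days=30.0):
    """Simplified weighted average of district polls.

    Each poll is (dem_pct, rep_pct, age_in_days, pollster_weight, partisan),
    where `partisan` is "D", "R" or None. The half-life and pollster weights
    are illustrative placeholders, not the model's actual parameters.
    """
    dem_total = rep_total = weight_total = 0.0
    for dem, rep, age, rating_weight, partisan in polls:
        # Shift 2 points from the sponsoring party to its opponent,
        # per the partisan-poll adjustment described above.
        if partisan == "D":
            dem, rep = dem - 2.0, rep + 2.0
        elif partisan == "R":
            dem, rep = dem + 2.0, rep - 2.0
        # More recent polls and higher-rated pollsters get more weight.
        weight = rating_weight * 0.5 ** (age / halflife_days)
        dem_total += weight * dem
        rep_total += weight * rep
        weight_total += weight
    return dem_total / weight_total, rep_total / weight_total

# Three hypothetical polls of one district.
polls = [
    (48.0, 46.0, 5, 1.0, None),  # recent, high-rated, nonpartisan
    (51.0, 44.0, 12, 0.8, "D"),  # Democratic-sponsored: treated as 49-46
    (45.0, 48.0, 40, 0.9, "R"),  # older Republican poll: treated as 47-46
]
print(district_poll_average(polls))
```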
2. The fundamentals
Importantly, we also infer what polls would say in districts without polling. We do this by running a regression that predicts the polling averages in each House seat using a variety of non-polling factors we call the "fundamentals."
The first, and most important, fundamental is simply the partisan lean of the district. 538 calculates a district's partisan lean by taking a weighted blend of how much redder or bluer the district voted than the country as a whole in the most recent and second-most recent presidential election.
The exact weight that the most recent presidential result gets relative to the second-most recent is decided by our model based on what has best predicted districts' future partisanship in the past. For the 2024 election, partisan lean is about four parts 2020 results and one part 2016 results. This means a district that voted by 9.5 points for President Joe Biden in 2020 (thus being 5.0 points bluer than the overall nation, which voted for Biden by 4.5 points) and by 3.1 points for former Secretary of State Hillary Clinton in 2016 (thus being 1.0 point bluer than the overall nation, which voted for Clinton by 2.1 points) would have a partisan lean of D+4.2 (the average of 5.0 and 1.0 if 5.0 gets four times as much weight as 1.0). This number represents the expected vote margin in this seat in a completely neutral year — if the congressional candidates perform exactly the same as the presidential candidates over the last two cycles.
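That calculation is simple enough to express directly. Here is the worked example from above in code, using the 4:1 weighting and national margins quoted in the text:

```python
def partisan_lean(district_2020, district_2016, national_2020=4.5, national_2016=2.1):
    """Blend of how much bluer (positive) or redder (negative) a district voted
    than the nation in the last two presidential elections, weighted roughly
    4:1 in favor of the more recent result, as described above."""
    lean_2020 = district_2020 - national_2020
    lean_2016 = district_2016 - national_2016
    return (4 * lean_2020 + 1 * lean_2016) / 5

# The worked example from the text: Biden +9.5 in 2020, Clinton +3.1 in 2016.
print(partisan_lean(9.5, 3.1))  # 4.2, i.e., D+4.2
```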
Of course, congressional candidates are not presidential candidates, so the model also accounts for a handful of other district-specific factors, such as incumbency, candidate experience and campaign finance.
All these variables enter a regression model trained to predict the two-party vote in each contested House election since 1998. That model is fit using a statistical technique called Markov chain Monte Carlo, which we run using the programming language Stan. Markov chain Monte Carlo takes a mathematical equation from the user and estimates, for each variable, the full range of values that are consistent with the historical data, capturing not just a single best guess but the uncertainty around it. More details on this technique are available in the methodology for our presidential election model.
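Stan implements far more sophisticated samplers, but a toy random-walk Metropolis sampler conveys the basic idea: propose parameter values, accept or reject them based on how well they explain the data, and read the uncertainty off the retained samples. Everything below (the simulated data, the single predictor, the fixed error scale) is illustrative, not the actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated training data: House margin as a noisy function of one predictor.
# Purely illustrative; the real model has many predictors and is fit in Stan.
lean = rng.normal(0, 10, size=200)
margin = 0.9 * lean + 1.5 + rng.normal(0, 5, size=200)

def log_posterior(slope, intercept, sigma=5.0):
    """Log posterior with flat priors: just the Gaussian log-likelihood."""
    resid = margin - (slope * lean + intercept)
    return -0.5 * np.sum((resid / sigma) ** 2)

# Random-walk Metropolis: propose a small move, accept it based on how much
# better or worse it explains the data, and repeat. The retained samples
# trace out the plausible range of each parameter.
slope, intercept = 0.0, 0.0
current = log_posterior(slope, intercept)
samples = []
for _ in range(20_000):
    cand_slope = slope + rng.normal(0, 0.05)
    cand_intercept = intercept + rng.normal(0, 0.2)
    candidate = log_posterior(cand_slope, cand_intercept)
    if np.log(rng.uniform()) < candidate - current:
        slope, intercept, current = cand_slope, cand_intercept, candidate
    samples.append((slope, intercept))

draws = np.array(samples[5_000:])  # discard warm-up draws
print("slope:", draws[:, 0].mean().round(2), "+/-", draws[:, 0].std().round(2))
print("intercept:", draws[:, 1].mean().round(2), "+/-", draws[:, 1].std().round(2))
```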
This method works great for predicting support for Republicans and Democrats in races with only two candidates — one from each major party. But what about candidates of other parties? We predict support for third-party candidates in the second stage of our House model. This stage trains a simpler regression model to predict historical third-party vote share using two variables: the number of third-party candidates running in the seat and, where available, the average support for third-party candidates in the polls. We then get the Democratic and Republican vote shares by applying their two-way vote shares from the first stage of the model to all the votes left over.
We also run similar regression models to predict support for individual candidates in races in which all candidates are from the same party, which is common in California, Louisiana and Washington because of their top-two primary systems. In these cases, we predict support for all candidates of one party separately, using a model trained to predict the historical number of votes cast for a candidate divided by all the votes cast for their party in the respective race. That model takes the number of candidates running, campaign finance and polls into account. We add these back into our overall model by multiplying the predicted two-party share of the vote for each candidate's party by the candidate's predicted share of their party's votes. For example, if we predict the two-party Republican vote in a seat to be 40 percent and predict a certain Republican candidate to win 50 percent of their party's votes, the final predicted vote share for that candidate would be 20 percent.
In Louisiana's jungle primaries, if no candidate wins a majority of the vote (which is required to avoid a runoff), we simulate the runoff using either the original predicted two-party vote for the seat (when there is both a Democratic and Republican candidate) or the predicted share of the partisan vote going to each candidate divided by the total between them (when both candidates are from the same party).
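To make the arithmetic of these stages concrete, here's a small sketch combining the leftover-vote allocation, the within-party split from the example above, and the Louisiana runoff check. The function name and the 6 percent third-party figure are hypothetical:

```python
def allocate_votes(two_way_dem, third_party_share):
    """Turn the stage-one two-party share and stage-two third-party share
    into full vote shares (all values are fractions of the total vote)."""
    leftover = 1.0 - third_party_share  # votes remaining for the major parties
    return two_way_dem * leftover, (1.0 - two_way_dem) * leftover

# A hypothetical seat where third-party candidates are predicted to win 6 percent:
dem, rep = allocate_votes(two_way_dem=0.52, third_party_share=0.06)
print(f"D {dem:.1%}, R {rep:.1%}")  # D 48.9%, R 45.1%

# Same-party races: the party's two-party share times the candidate's predicted
# share of the party's own votes -- the 40 percent x 50 percent = 20 percent example.
print(f"Candidate share: {0.40 * 0.50:.0%}")

# Louisiana jungle primaries: simulate a runoff unless someone wins a majority.
needs_runoff = max(dem, rep) <= 0.50
print("Runoff needed:", needs_runoff)
```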
3. Qualitative race ratings
Empirically, combining the polls and fundamentals gives us a model with a very strong historical track record for predicting congressional elections. On average across the 2006 through 2022 elections, a version of our forecast that uses just polls and fundamentals picks the wrong winner in contested House elections just 4 percent of the time — equivalent to about 17 districts in a typical year. And in terms of our median forecasted number of seats for each party (e.g., "223 seats for Democrats" or some other hypothetical number), we are off by an average of just five seats.
However, there are limits to what we can measure about House elections using data and hard math. Sometimes districts jump to the left or right because of local, more qualitative factors that polls miss and that patterns from similar districts don't capture. For example, candidates can be inexperienced or a bad fit for a district in ways not reflected in our quantitative metrics on experience and incumbency.
In these cases it makes sense to use qualitative race ratings to fill in the gaps. These ratings come from experts over at Sabato's Crystal Ball, the Cook Political Report and Inside Elections, where analysts spend time interviewing candidates, talking to constituents and crunching their own demographic and electoral data to make predictions about elections. You will usually see these ratings spelled out in a qualitative manner, such as "Lean Republican" or "Likely Democrat." We take those categorical ratings, convert them into numbers and input them into our models as their own predictors in our big regressions.
We convert experts' categorical ratings into numbers by taking their historical ratings and calculating the average margins and standard deviation of those margins for all the candidates within each category. For example, in expert ratings from 2000 through 2022, we find that when a party's candidate was given a rating of "Tilt" or "Lean" in their favor, they usually beat their opponent by 6 points (with a standard deviation of 9 points). In "Likely" seats, the party usually won by 11 points (with a standard deviation of 10 points), and in "Safe" or "Solid" seats, 33 points (with a standard deviation of 17 points). The predicted margin for a candidate in a Toss-up seat is +0, with a standard deviation of 9 points.
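In code, that conversion amounts to a small lookup table built from those historical averages. The numbers below are the ones quoted above; treating "Tilt" and "Lean" (and "Safe" and "Solid") identically mirrors how the categories are grouped in the text:

```python
# Historical average margins and standard deviations (in points) for candidates
# rated in each category from 2000 through 2022, per the figures above.
RATING_PRIORS = {
    "Toss-up": (0.0, 9.0),
    "Tilt":    (6.0, 9.0),
    "Lean":    (6.0, 9.0),
    "Likely":  (11.0, 10.0),
    "Safe":    (33.0, 17.0),
    "Solid":   (33.0, 17.0),
}

def rating_to_margin(category, favored_party):
    """Convert a categorical rating such as ('Lean', 'R') into a signed margin
    (positive = Democratic) and a standard deviation."""
    mean, sd = RATING_PRIORS[category]
    return (mean if favored_party == "D" else -mean), sd

print(rating_to_margin("Lean", "R"))  # (-6.0, 9.0): "Lean R" implies R +6
```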
We use this converted categorical rating as a predictor in our overall regression model, letting the forecast decide how much weight to put on experts’ conventional wisdom based on how much additional value they have provided historically above and beyond the polls plus fundamentals. We find that, on average from 2000 through 2022, a race rating should have received about 36 percent of the overall weight in a congressional district prediction, with the remaining 64 percent going to a combination of the district-level polls (12 percent) and fundamentals plus generic ballot (52 percent).
The final weights of our model depend on how many polls are available for each seat. In competitive districts with no polls, our model puts about 42 percent of the final weight of the cumulative prediction on the race rating, about 56 percent on the fundamentals plus generic ballot polls, and the remaining 2 percent on an imputed polling average based on polls in other districts. As the number of polls for a seat increases, weight on the polls increases and weight on the ratings and fundamentals decreases. If there are a dozen polls for a seat, we would put about 9 percent of the final weight on the race rating, 20 percent on the fundamentals and 71 percent on the polling average for that seat. And in races with only a few polls, the polling average itself is reverted toward the average imputed from polls in other districts.
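Here's a simplified illustration of how those weights might blend into a single district prediction. The two weight profiles are the ones quoted above; the linear ramp between them is a placeholder, since the model's actual weighting schedule depends on the polls themselves:

```python
def blended_margin(rating, fundamentals, polls, n_polls):
    """Blend the three component predictions (all margins in points) for one
    district. The two weight profiles are the ones quoted above; the linear
    ramp between them is a placeholder, not the model's actual schedule."""
    no_polls = {"rating": 0.42, "fund": 0.56, "polls": 0.02}
    dozen_polls = {"rating": 0.09, "fund": 0.20, "polls": 0.71}
    t = min(n_polls / 12.0, 1.0)
    w = {k: (1 - t) * no_polls[k] + t * dozen_polls[k] for k in no_polls}
    return w["rating"] * rating + w["fund"] * fundamentals + w["polls"] * polls

# A hypothetical district: rating implies D+6, fundamentals D+3, five polls D+1.
print(f"Blended margin: D+{blended_margin(6.0, 3.0, 1.0, n_polls=5):.1f}")
```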
4. Simulating uncertainty
We built our model on a history-backed theory of how voters behave in congressional elections and have paid special attention to how much we should weigh this theory against polling data for the current election. But our model will not be perfect in every seat — and especially in years when polls are off, we can have some big misses. The value add of our forecast is in correctly quantifying that uncertainty so we can give the public a good idea of the true range of outcomes for a given contest.
Our model looks at prediction error as coming from five overall sources. First, there is national error, which is shared across all seats. This error accounts for things like the generic ballot polls being off or the predictive value of presidential versus district results changing dramatically from cycle to cycle. Second, there is regional error, which can affect all the districts belonging to one region. It's possible that there will be a shift toward Democrats just in New England, for example, or toward Republicans in the Southwest. We group states according to the regional definitions established in 538's presidential election forecast.
Third, there is error in seats that are similar demographically but that are not necessarily geographically or politically close to one another. If polls underestimate support for Republican House candidates among Black voters, for example, that error will be spread across seats with lots of Black voters in the Northeast, South, Midwest, etc. So we additionally split the country up into five demographic "regions" based on the share of each seat's residents who are Black, Hispanic or Asian; the share who hold a college degree; the district's median income and median age; and the percentage of the district that is urban, per the five-year American Community Survey (or the decennial census for cycles before 2010). (A toy sketch of this kind of clustering appears below, after the full list of error sources.) Where possible, these variables are gathered for the year of the election we are forecasting — e.g., census data for Alaska's at-large House seat in 2022 was taken from American Community Survey data published in 2022. Where that's not possible, we impute these metrics first from other years' census data for the same seat or, as a last resort, using multiple imputation on the full range of congressional elections in our data. We also estimated 2024 demographic data for states with redistricting by using tract-level data from the 2022 American Community Survey.
Fourth, there is state-specific error. This error is important since all the districts in one state can be influenced by the effects of state politics, shifting demographics or weird turnout patterns (see: New York in 2022). Fifth and finally, there is idiosyncratic error belonging to each district in isolation; this accounts for factors such as bad polls, candidate quality issues, local weather on Election Day and the like.
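One simple way to form demographic groups like these (shown purely as an illustration, not the production method) is k-means clustering on standardized district features. The data below is random:

```python
import numpy as np

rng = np.random.default_rng(1)

# Fake standardized features for 435 districts. Columns stand in for: percent
# Black, Hispanic and Asian; percent college-educated; median income; median
# age; and percent urban.
features = rng.normal(size=(435, 7))

def kmeans(X, k=5, iters=50):
    """Bare-bones k-means: an assumed stand-in for however the model actually
    groups districts into demographic clusters."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign every district to its nearest cluster center.
        labels = np.argmin(((X[:, None, :] - centers) ** 2).sum(axis=2), axis=1)
        # Move each center to the mean of its assigned districts.
        centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
    return labels

clusters = kmeans(features)
print(np.bincount(clusters))  # district count in each demographic "region"
```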
Added up, we expect there to be about 8 points of error on the margin between candidates in each seat. Error comes relatively equally from each source: There are about 3 points of error at the national level, 2 each at the regional, demographic-cluster and state levels, and 6 at the district level. (These numbers don't add up to 8 points because independent errors combine as the square root of the sum of their squared values, not as a simple sum.) We also allow for district-level error to be higher in redistricting cycles and in seats further from an even partisan lean. (Districts with lopsided margins tend to be harder to predict the exact margin of, even though seat control is not in question.) Moreover, we add a little bit of error to account for possible unforeseen issues in generic ballot polling, which helps us control for the effects of things like partisan non-response or problems with poll weighting. This additional error amounts to about 2 points on the margin and affects all races simultaneously.
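Here is a minimal simulation of that error structure. The group assignments are random stand-ins, but the standard deviations are the ones given above, and the total comes out to roughly 8 points:

```python
import numpy as np

rng = np.random.default_rng(2024)
N_SIMS, N_SEATS = 5_000, 435

# Standard deviations (points of margin) for each error source, per the text.
NATIONAL, REGIONAL, DEMO, STATE, DISTRICT, GENERIC = 3, 2, 2, 2, 6, 2

# Random stand-ins for each seat's region, demographic cluster and state.
region = rng.integers(0, 8, N_SEATS)
demo = rng.integers(0, 5, N_SEATS)
state = rng.integers(0, 50, N_SEATS)

# Draw each error once per simulation, then share it across the member seats;
# that sharing is what makes errors correlated within a group.
total = (
    rng.normal(0, NATIONAL, (N_SIMS, 1))           # shared by every seat
    + rng.normal(0, GENERIC, (N_SIMS, 1))          # generic ballot error
    + rng.normal(0, REGIONAL, (N_SIMS, 8))[:, region]
    + rng.normal(0, DEMO, (N_SIMS, 5))[:, demo]
    + rng.normal(0, STATE, (N_SIMS, 50))[:, state]
    + rng.normal(0, DISTRICT, (N_SIMS, N_SEATS))   # idiosyncratic error
)
print(f"Typical error per seat: about {total.std():.1f} points")  # roughly 8
```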
That's it! If you have any questions, or see anything missing from our methodology or forecast page, drop us a line and we'll get right on it!