538's polls policy and FAQs

How we decide which surveys to include in our database and models.

November 27, 2023, 5:58 PM

At 538, we strive to accumulate and analyze polling data in a way that is honest, informed, comprehensive and accurate. While we do occasionally commission polls, most of our understanding of American public opinion comes from aggregating polls conducted by other firms and organizations. This data forms the foundation of our polling averages, election forecasts and much of our political coverage.

Because there are many different ways to conduct a poll, we are intentionally permissive when deciding which surveys to include in our database and models. As a general rule, we aim to capture all publicly available polls that are being conducted in good faith. While some polls have a more established track record of accuracy than others, we exclude polls from our dataset only in exceptional circumstances.

However, we do have fundamental standards that we expect pollsters to uphold. These standards are twofold: First, we have a set of methodological standards aimed at ensuring that we can verify polls included in our forecasts and models are based on sound survey methods. Second, we have a set of ethical standards aimed at ensuring pollsters are honestly engaged in the pursuit of truth and knowledge.

Fundamentally, both of these sets of standards are in service of the same goal: painting an informative and accurate picture of public opinion about politics and elections. To do so, the data we use must be both trustworthy and scientific. Our standards, set out in detail below, are the rules we use to ensure that is the case.

Methodological standards

Our methodological standards are intended to ensure that a pollster is conducting research using scientifically sound methods. For a primer on the scientific principles undergirding public opinion research, we recommend this detailed overview from the Pew Research Center.

Simply put, we will consider aggregating any poll that has been made publicly available and meets a few basic standards of disclosure:

  1. The poll must include the name of the pollster, survey dates and details about the population sampled. If these are not included in the poll’s release, we must be able to obtain them in order to include the poll.
  2. Pollsters must also be able to answer basic questions about their methodology, including but not limited to the polling mode used (e.g., telephone calls, text messages, online panels), the source of their sample, their weighting criteria and the source of the poll’s funding. In most cases, a detailed written methodology statement is sufficient to satisfy this criterion, but we may contact pollsters directly to clarify methodological details and follow up occasionally to ensure the pollster is still meeting our standards.
  3. Any questions we include on our polls page or in our models must have a topline sample size of at least 100.
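As a rough illustration, the three disclosure checks above could be expressed as a simple screening function. This is only a sketch: the `PollRecord` type, its field names and the `disclosure_problems` helper are hypothetical and do not describe 538's actual database or tooling.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

MIN_SAMPLE_SIZE = 100  # topline sample-size floor stated in the policy


@dataclass
class PollRecord:
    """Hypothetical record; field names are illustrative only."""
    pollster: Optional[str]
    start_date: Optional[date]
    end_date: Optional[date]
    population: Optional[str]     # e.g. "lv", "rv" or "a"
    methodology_disclosed: bool   # mode, sample source, weighting, funding
    sample_size: Optional[int]    # None if not yet reported


def disclosure_problems(poll: PollRecord) -> List[str]:
    """Return the reasons a poll fails the basic checks (empty list if none)."""
    problems = []
    if not poll.pollster:
        problems.append("missing pollster name")
    if poll.start_date is None or poll.end_date is None:
        problems.append("missing survey dates")
    if not poll.population:
        problems.append("missing population details")
    if not poll.methodology_disclosed:
        problems.append("methodology not disclosed")
    # An unreported sample size is not treated as a failure in this sketch;
    # only a reported size under the floor is.
    if poll.sample_size is not None and poll.sample_size < MIN_SAMPLE_SIZE:
        problems.append("sample size below 100")
    return problems
```

In practice, a failed check would more likely prompt a follow-up with the pollster than an automatic rejection, since the policy allows missing details to be obtained after a poll's release.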

There are some types of methodologies we don’t include, such as:

  1. Nonscientific polls that don’t attempt to survey a representative sample of the population or electorate. This includes surveys that do not attempt to weight or adjust their sample to represent the desired population. (Using quotas to recruit a representative sample is fine.)
  2. Polls that blend or smooth their data over time using methods like MRP (short for “multilevel regression with poststratification”) or polls that use MRP to estimate results in geographies or populations they did not sample directly. While this is a valid technique for understanding public opinion data, we exclude these polls because we consider them more like models than individual polls. (As an analogy, we think of this as using someone else’s barbecue sauce as an ingredient in your own barbecue sauce.) However, we will include polls that use model-based techniques like MRP as a substitute for traditional weighting methods to adjust individual samples of opinion, rather than smoothing the data over time.
  3. Do-it-yourself polls commissioned by nonprofessional hobbyists on online platforms like Google Surveys or SurveyMonkey. (Professional or campaign polls using these platforms as a source for raw data are fine.)
  4. Subsamples of multistate polls are not treated as individual polls of those states unless there is some method employed to verify the geographic location of the respondents and each state in the poll is weighted individually.
  5. Questions that reveal leading information about the candidates before asking voters whom they support, sometimes called an “informed ballot” question. If, for instance, a poll says “President Joe Biden loves puppies. Whom do you plan to support: Biden or former President Donald Trump?”, we won’t include it.
  6. Questions that include generic candidates (other than the generic congressional ballot). For example, if a poll asked, “Whom do you plan to support: Biden, Trump, or a moderate independent candidate?”, we won’t include it.

General election polls that include hypothetical candidates — for example, a poll testing a hypothetical three-way presidential race between Trump, Biden and former Rep. Liz Cheney — are included on our polls page but are not included in our models unless and until every politician asked about has declared their intention to seek the office in question and has gotten their name on the ballot or is engaged in a serious, well-funded write-in campaign. Write-in candidates are included on our polls page if a poll asks about them directly or if at least 10 percent of respondents to a poll volunteered them.

Polls of presidential primaries that include hypothetical candidates are also included on our polls page but may or may not be included in our models. (For example, our polling averages exclude polls of primary matchups that have already been ruled an impossibility. See our methodology for full details.)

Ethical standards

In addition to these methodological standards, we expect pollsters to adhere to several basic ethical standards. We’ve derived these standards in part from those developed by the American Association for Public Opinion Research in its Code of Professional Ethics and Practices. In particular, we expect the following:

  1. Pollsters will not falsify or fabricate data.
  2. Pollsters will not engage in betting markets that may be directly impacted by their survey work.
  3. Pollsters will not knowingly select research tools or methods of analysis that yield misleading conclusions.
  4. Pollsters will disclose to the public the methods and procedures used to obtain publicly disseminated research results. When such disclosures are insufficient, pollsters will respond to queries seeking further detail.
  5. Pollsters will not knowingly make interpretations of research results that are inconsistent with the data available, nor sanction such interpretations made by others. Pollsters will ensure that any findings they report are an accurate portrayal of their results. Pollsters will not knowingly imply that interpretations are accorded greater confidence than the data warrants.
  6. Pollsters will not misrepresent the purpose of their surveys or conduct other activities (such as sales, fundraising or true “push polling” for political campaigns) under the guise of conducting research. Polls that are conducted and released on behalf of political clients (candidates, PACs, lobbying organizations or other such clients) will be clearly labeled as such. Pollsters will make no false or misleading claims as to a poll’s sponsorship or purpose.
  7. Pollsters will correct any errors in their work that come to their attention and that could influence interpretation of the results. If factual misrepresentations or distortions have been made publicly, pollsters will correct them in a public forum that is as similar as possible to the original data dissemination.

Consequences

When a new pollster releases data for the first time, we conduct an initial screening to ensure it meets our methodological and ethical standards. If this screening is insufficient to determine whether the pollster meets these standards, we will follow up with a more thorough review that can include multiple avenues of investigation, such as phone conversations, requests for additional methodological information or detailed data such as crosstabs or raw individual-level data, or other queries that can convince us of the pollster’s scientific and ethical integrity. We may also conduct such a review of an established pollster if one of its polls appears methodologically or ethically dubious or if we receive credible allegations of misconduct about the pollster.

Any pollster found to be violating one of our first two ethical standards — falsifying data or engaging in betting markets directly impacted by its survey work — will be permanently excluded from our aggregation. This is irreversible. These actions raise serious questions about the intent of the pollster’s work and irreparably break our trust in its future polling.

If there are at least two suspected violations of ethical standards No. 3 through No. 7, we will inform the pollster of our concerns about its behavior and give it at least three opportunities to either show that the behavior has been corrected or argue that the behavior is not problematic. If we do not hear from the pollster after three attempts to reach it, at least one by phone and at least one by email, or if its response is insufficient to address our concerns, we will exclude future polling by that firm from our polls page, model runs and articles. (Polls conducted by the pollster prior to the date of exclusion will remain on our polls page, and old model runs will be unaffected.) The pollster can be reinstated at any point if it can demonstrate that it meets the above criteria and that any unethical behavior has been corrected. We expect these situations to occur extremely rarely; if more than two pollsters are removed under this section of the policy in a given calendar year, we will reevaluate our procedures and make changes as needed.

Alternatively, for a pollster that at least twice fails to disclose the sponsors of its surveys (a violation of ethical standard No. 6), we may decide to treat all of the pollster’s polls as partisan rather than excluding them from our data entirely. This impacts how the polls appear on our page and how they are treated by our models.

Partisan surveys

While we know that polls are often paid for by entities with a vested interest in the outcomes of elections, such polls can provide useful data when properly adjusted, so long as pollsters and sponsors are not misrepresenting the outcome of their research or using scientifically unsound methods. As long as they meet the standards outlined above, we include these internal and partisan polls in our database and models, except in one unusual circumstance (a general election poll sponsored by a candidate’s rival in the primary*).

Polls are considered “partisan” if they’re conducted on behalf of any organization that conducts a large majority of its political activity on behalf of one political party or candidate. Typically, a partisan organization is a PAC, super PAC, hybrid PAC, 501(c)(4), 501(c)(5) or 501(c)(6). However, we may consider organizations of other types to be partisan if their political spending indicates a large majority of activity is conducted on behalf of a particular party. We will also count all of a pollster’s polls as partisan if it is formally affiliated with a partisan organization; for instance, the Democratic Congressional Campaign Committee has an in-house polling operation that is always considered partisan. Polls sponsored by partisan organizations are noted on the polls page by a solid diamond next to the sponsor’s name. Polls conducted by partisan pollsters are noted by a solid diamond next to the pollster’s name.

Additionally, if we find that a sponsor organization is selectively releasing polls favorable to a certain candidate or party, we may also categorize that organization as partisan. We generally go out of our way to not characterize news organizations as partisan, even if they have a liberal or conservative view. But selectively releasing data that favors one party is a partisan action, and those polls will be treated as such. These classifications may be revisited if a sponsor ceases engaging in this behavior.

“Internal” polls are a narrower category of partisan surveys that are conducted on behalf of a political party, campaign committee or other official party apparatus. Internal polls are noted on the polls page with a hollow diamond next to the sponsor’s name.

Below are some questions we’ve been asked over the years about the types of polls we collect. Take a look, and if you still have questions or find a poll we don’t have, please email us at polls@fivethirtyeight.com.

Frequently asked questions

Q: Which races do you collect polls for?
A: We collect horse-race polls for presidential, Senate, gubernatorial and House general elections (including polls of the generic congressional ballot). We also collect job approval and/or favorability polls for various politicians and governmental institutions. At this time, we do not collect primary polls other than for the presidency — except in “jungle primaries” when it’s possible for a candidate to win the seat outright.

Q: Why don’t you have any polls of the race I’m interested in?
A: The latest polls page includes all polls publicly released within two years of a given election, beginning with the 2018 election cycle. If we don’t have any polls for a particular race, that means we aren’t aware of any polls for that race.

Q: Why isn’t a particular candidate included in your polling averages?
A: Candidates appear in general election polling averages if they have received at least 5 percent support in at least five different polls from at least three different pollsters, have been asked about in at least eight polls total, and have had their name included on the ballot or are running a serious, well-funded write-in campaign. For presidential general elections, being included on the ballot in at least one state is sufficient for inclusion in our national polling average, provided the other conditions are also met.

Polling averages for presidential primaries include every candidate who qualifies as “major” according to our criteria and has been included in at least eight polls from at least three different pollsters.
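For readers who think in code, the general election thresholds above can be sketched as a single check. The function name, the `(pollster, results)` input shape and the `ballot_access` flag are hypothetical. The sketch also assumes one reading of the rule: that the three different pollsters must be among the polls showing the candidate at 5 percent or higher.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical shape: (pollster name, {candidate name: percent support})
Poll = Tuple[str, Dict[str, float]]


def qualifies_for_general_average(
    polls: Iterable[Poll], candidate: str, ballot_access: bool
) -> bool:
    """Apply the stated thresholds for a general election polling average."""
    # Polls that asked about the candidate at all
    asked = [
        (pollster, results[candidate])
        for pollster, results in polls
        if candidate in results
    ]
    # Polls showing the candidate at 5 percent or higher
    at_five = [(pollster, pct) for pollster, pct in asked if pct >= 5.0]
    pollsters_at_five = {pollster for pollster, _ in at_five}
    return (
        ballot_access                    # on the ballot or a serious write-in
        and len(asked) >= 8              # asked about in at least eight polls
        and len(at_five) >= 5            # at least five polls at 5 percent or more
        and len(pollsters_at_five) >= 3  # from at least three pollsters
    )
```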

Q: Can I download this data?
A: Yes! There is a dropdown menu containing all our polling datasets at the bottom of the latest polls page, but you can also download this data and more from our data repository, which includes all our polls, forecasts and other data projects. Unfortunately, under the terms of our data-sharing agreements, we are not able to share data on presidents’ approval ratings before Trump. You can find additional information on historical presidential approval ratings, and guidelines for acquiring that dataset, on the Roper Center’s website.

Q: How do you account for a pollster that publishes multiple results for a question?
A: When a pollster publishes multiple populations (for example, all adults, only registered voters and only likely voters), we include all of them in our database. Similarly, if a pollster asks a horse-race question with different sets of candidates (for example, with and without a third-party candidate), we include all versions of this question in our database. If the pollster includes multiple likely voter models, as in this Monmouth poll, we first check if the pollster indicates that one of the versions is its preferred option. If it does, we use that version; otherwise, we include them all in our database. Our election models take these multiple versions into account, preferring likely voters over registered voters and registered voters over all adults. The model also averages results in the case of multiple topline questions, such as when a poll asks both a head-to-head matchup and a matchup including third-party candidates or has more than one likely voter model. Our approval and favorability models work similarly, though these models prefer questions among all adults over those among registered voters and questions among registered voters over those among likely voters.
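The population preference described above can be illustrated with a short sketch. The population codes ("lv", "rv", "a"), the function name and the input shape are assumptions made for illustration, and the real models may combine versions differently than this simple mean does.

```python
from statistics import mean
from typing import List, Sequence, Tuple

# Preference orders described in the answer above (codes are illustrative)
ELECTION_PREFERENCE = ["lv", "rv", "a"]   # likely > registered > all adults
APPROVAL_PREFERENCE = ["a", "rv", "lv"]   # all adults > registered > likely


def preferred_value(
    versions: List[Tuple[str, float]], preference: Sequence[str]
) -> Tuple[str, float]:
    """Pick the most-preferred population present in one poll and average
    its versions (e.g. head-to-head and multi-candidate toplines)."""
    for population in preference:
        values = [value for pop, value in versions if pop == population]
        if values:
            return population, mean(values)
    raise ValueError("no recognized population in this poll")
```

For a poll that reports a candidate at 46 and 48 percent in two likely voter versions and 44 percent among registered voters, the election ordering would use the likely voter versions and average them to 47 percent.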

Q: How do you account for a pollster that publishes numbers with and without “leaners”?
A: When a pollster publishes one version of a question with “leaners” — respondents who may be uncertain about their vote but say that they lean toward a particular candidate or party — and one without, we include only the version with leaners. If the question including leaners is a “forced-choice” question, in which respondents are not given an option of saying they are undecided when asked which way they lean, we still include only that version of the question instead of the version without leaners.
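In code terms, the leaner rule amounts to a filter. The dictionary key `includes_leaners` and the function name below are hypothetical; this is a sketch of the rule as stated, not of 538's actual pipeline.

```python
from typing import Dict, List


def select_leaner_versions(versions: List[Dict]) -> List[Dict]:
    """Keep only the versions with leaners when any exist; otherwise keep all.

    Each version is a dict with a boolean "includes_leaners" key
    (a hypothetical field name). Forced-choice leaner questions are
    treated the same as any other version that includes leaners.
    """
    with_leaners = [v for v in versions if v["includes_leaners"]]
    return with_leaners if with_leaners else versions
```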

Q: Do you weight or adjust polls?
A: Yes. When we calculate our polling averages, some polls get more weight than others. For example, polls that survey more people or were conducted more recently get more consideration in calculating our polling averages than polls with small sample sizes or older polls. Our polling averages also apply adjustments for things like how consistently a pollster leans toward one party or candidate. For more information on how we calculate polling averages, see this detailed methodology.

Q: Why do some races not have a polling average?
A: In order to ensure a polling average has enough data to accurately represent the state of the race, we do not publish an average for a race until the major party candidates have been officially selected in primaries and we have collected at least eight different polls from at least three different pollsters for that race. In addition, we may not have polling averages for races that use an instant runoff, as the candidates included in the questions may differ from pollster to pollster.
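The publication threshold above reduces to a small check, sketched here with hypothetical names. The input is assumed to be one pollster name per poll collected for the race.

```python
from typing import List


def can_publish_average(pollster_per_poll: List[str], nominees_selected: bool) -> bool:
    """Check the stated minimums before publishing a polling average."""
    enough_polls = len(pollster_per_poll) >= 8           # at least eight polls
    enough_pollsters = len(set(pollster_per_poll)) >= 3  # at least three pollsters
    return nominees_selected and enough_polls and enough_pollsters
```

Eight polls from only two pollsters would not meet the bar, nor would seven polls from seven different pollsters.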

Q: Why are the sample sizes sometimes missing for polls?
A: If a poll does not have a sample size listed, the pollster or sponsor did not report it and we are actively working to obtain it. These polls are still included in our averages and models with an imputed sample size based on the sample sizes we typically expect from that pollster for the type of race being polled. Once we obtain the actual sample size, we will update the data and any relevant models. (We recalculate model estimates only for the day the change was made, not the past.)

Q: Why do the values in some polls add up to more than 100 percent?
A: Values in some polls may appear to add up to more than 100 percent due to rounding. For example, if a pollster published a poll that gave the president an approval rating of 46.5 percent and a disapproval rating of 53.5 percent, then those numbers would round up to 47 percent and 54 percent, respectively.
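The arithmetic in that example can be checked with a few lines of Python. The sketch assumes halves are rounded up, as the example implies; the helper `round_half_up` is written for illustration because Python's built-in `round` rounds halves to the nearest even integer and would turn 46.5 into 46.

```python
import math


def round_half_up(value: float) -> int:
    """Round to the nearest integer, with halves rounded up."""
    return math.floor(value + 0.5)


approve = round_half_up(46.5)      # 47
disapprove = round_half_up(53.5)   # 54
total = approve + disapprove       # 101, although the unrounded values sum to 100
```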

Q: Why do the margins in some polls not match what the pollster reports?
A: This often boils down to rounding. For example, if a pollster puts one candidate at 45.2 percent and another at 45.6 percent, we’ll display these two candidates at 45 percent and 46 percent, respectively, since we round to the nearest integer. The actual margin in the poll, however, is 0.4 percentage points, which will display on our site as a difference of 0.

This may also occur if the pollster publicizes a version of the question that differs from the version that 538 includes on our polls page. For example, a pollster may publicize a horse-race question that does not include leaners, but 538 publishes only the version that does include leaners.
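The rounding case above can be reproduced in a short sketch. It assumes that each candidate's share and the margin are rounded separately from the unrounded figures, which is what the example describes; the helper name is hypothetical.

```python
import math


def round_half_up(value: float) -> int:
    """Round to the nearest integer, with halves rounded up."""
    return math.floor(value + 0.5)


first, second = 45.2, 45.6

shown_first = round_half_up(first)             # 45
shown_second = round_half_up(second)           # 46
shown_margin = round_half_up(second - first)   # 0, from an unrounded margin of about 0.4

# The displayed shares differ by 1, but the displayed margin is 0
gap_between_shown_shares = shown_second - shown_first
```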

*The reason for this is that we don’t have good priors for which direction the bias in such polls might run. Typically, for instance, a Democratic internal poll would have Democrats faring well in a general election matchup. But in a competitive primary, it’s plausible that a Democrat might want to draw attention to numbers that made a rival Democrat’s general election chances look worse.
