All these calculations account only for sampling error, the only kind of imprecision that's readily quantifiable in probability-based samples. Survey research also is subject to non-quantifiable non-sampling error, including factors such as methodological rigor; non-random non-coverage of elements of the population under study; non-random non-response influencing who participates; the wording, order and response categories in questions; and the professionalism of interviewers and data producers. Of note, no margin of sampling error is calculable in non-random, non-probability samples, such as opt-in internet panels.
Update on design effect - 12/09
A further complication in sampling error, alluded to above, stems from a survey's design effect, a calculation that adjusts for two things: clustering in area probability samples (exit polls, for example, or our face-to-face surveys in Iraq and Afghanistan), and weighting, which is relevant to random-digit-dialed (RDD) telephone surveys as well as to other forms of probability sampling.
In exit polls conducted for the National Election Pool, a media consortium including ABC News, the design effect of clustering and weighting alike is given as 2.25. As a result, a sample of 1,000 people in one of these exit polls has an error margin of +/-4.5 points (with a 50/50 split at the 95 percent confidence level), rather than the 3 points that would have been calculated without taking the design effect into account. (This is figured by multiplying the error margin based on sample size alone, in this case 3 points, by the square root of the design effect; the square root of 2.25 is 1.5.)
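The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not ABC's production calculation: the function name margin_of_error and its parameters are made up for this example, and 1.96 is the standard multiplier for a 95 percent confidence level. Unrounded, the figures come to about 3.1 and 4.6 points, which the text rounds to 3 and 4.5.

```python
import math


def margin_of_error(n, p=0.5, z=1.96, design_effect=1.0):
    """Margin of sampling error in percentage points.

    n             -- sample size
    p             -- assumed proportion (0.5 is the 50/50 split, the widest case)
    z             -- multiplier for the confidence level (1.96 for 95 percent)
    design_effect -- 1.0 means a simple random sample with no adjustment
    """
    # Error margin based on sample size alone.
    simple = z * math.sqrt(p * (1 - p) / n)
    # Inflate by the square root of the design effect.
    return 100 * simple * math.sqrt(design_effect)


simple = margin_of_error(1000)
adjusted = margin_of_error(1000, design_effect=2.25)
print(f"{simple:.1f} {adjusted:.1f}")  # prints "3.1 4.6"
```

Because the adjustment is a square root, a design effect of 2.25 widens the margin by half, not by 125 percent.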
In RDD telephone samples, the design effect due to weighting in the past generally has been so slight as to be ignorable. That's changed recently as telephone sampling procedures have been altered to include cell-phone respondents; these procedures increase the theoretical margin of sampling error because additional weighting is needed to incorporate the cell and landline samples. (The situation also occurs when oversamples are used to increase the reliability of the estimate of a particular group. Again, while oversampling is done to improve estimates, the weighting required to adjust the sample back to true population norms increases the design effect in the full sample.)
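To see why extra weighting raises the design effect, consider one widely used approximation, Kish's formula, which depends only on how unequal the weights are. The sketch below is an illustration under that assumption; this article does not say which formula ABC uses, and the function names and the toy weights are hypothetical.

```python
def kish_design_effect(weights):
    """Kish's approximate design effect due to unequal weighting.

    deff = n * sum(w^2) / (sum(w))^2
    Equal weights give exactly 1.0; more variable weights give larger values.
    """
    n = len(weights)
    total = sum(weights)
    return n * sum(w * w for w in weights) / (total * total)


def effective_sample_size(weights):
    """Size of the simple random sample with equivalent precision."""
    return len(weights) / kish_design_effect(weights)


# Hypothetical toy sample: three respondents at weight 1, one weighted up to 3.
toy_weights = [1.0, 1.0, 1.0, 3.0]
print(round(kish_design_effect(toy_weights), 3))     # prints 1.333
print(round(effective_sample_size(toy_weights), 3))  # prints 3.0
```

In this toy case, weighting one respondent up to count for three leaves four interviews with the precision of three, which is the cost described above when weights are needed to combine cell and landline samples or to scale an oversample back down.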
At ABC we've tracked the design effect of each poll we've conducted since we started adding cell-only interviews in fall 2008; in the last six (with consistent cell-only sample sizes) it's averaged 1.42. Inclusion of this design effect is why we now report most ABC/Post polls of about 1,000 people as having a margin of sampling error of plus or minus 3.5 points, rather than the customary 3 points.
It's ironic that taking steps to improve the accuracy of a survey by enhancing coverage of its target population has the perverse effect of increasing its theoretical margin of sampling error; this is a reason that sampling error in and of itself is not a full measure of a survey's accuracy. It's also a reason to be cautious in making comparisons across surveys. Some, less accurately, report a lower margin of sampling error because they don't take design effects into account. Others may have a lower theoretical error margin but significant non-coverage, an example of the non-sampling error described above.
In some ways this situation is similar to that involving response rates, which can be improved in ways that degrade sample coverage. (See details here.) Better response rates, for that reason, in and of themselves are not necessarily indicators of better data. Likewise, a lower theoretical sampling error does not necessarily indicate a better estimate, if for example it were obtained via a sample that failed to optimize coverage of the population under study.
With thanks for review and comment by Charles Franklin, Paul Lavrakas and Dan Merkle.