The Curious Case of Erroneous Polls
Updated: Mar 11, 2020
By Soumya Jeet
In the present world, opinion polls alongside elections are reckoned as virtual vehicles of democracy in action. Even before campaigning has moved beyond its warm-up stage, polling agencies come up with neat statistical analyses of which candidates are leading. What started as a crude straw poll in Pennsylvania as early as 1824 has now become synonymous with the onset of elections in most advanced democracies, including ours. The statistical tools employed in opinion polls gained credibility after consecutive successful predictions of Harding, Coolidge, Hoover and finally Roosevelt as presidents of the United States of America, spanning the period from 1920 to 1932.
Sampling and the Margin of Error
Polls, like most other statistical phenomena, work on a well-defined sample and the extrapolation of the result to the population within appropriate confidence intervals. This quite obviously brings in the immediate possibility of sampling error. The margin of error in a poll is a measurement of how accurately the results of the poll reflect the traits of the whole population. In simpler words, it is a statistical formulation of how well the sample represents the general population.
To make things simpler, let us consider an example closer to home: the 2014 Lok Sabha elections. Suppose a hypothetical polling agency named Indopolls has predicted that Modi is leading with 51%, which means that, of the people polled, 51% prefer Modi to Rahul Gandhi. With this result comes a margin of error, which in this case, let us assume, is 2%. This means that if a different sample were polled, the fraction preferring Modi could vary from 49% to 53%. The margin of error is calculated so that the probability that a new poll’s result will lie in that 4-percentage-point range is 0.95, or 95%. To elaborate, if the margin of error is 2% and a thousand polls are conducted throughout the country, about 950 of them (95% of 1000) would be expected to show that 49% to 53% of people prefer Modi. Results can be made more precise by increasing the sample size. A margin of error of 2% to 3% is generally considered the most desirable, and it reflects a tradeoff: polling fewer people might yield unreliable results, whereas increasing the sample size makes the poll more expensive. Closely fought contests show most clearly why the margin of error matters: when the lead is smaller than the margin, as it is here, the result could tip in either candidate’s favour.
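The arithmetic behind the Indopolls example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the textbook normal approximation for a simple random sample, takes 1.96 as the critical value for 95% confidence, and assumes that true support really is 51%. The function names are made up for this sketch and do not describe how any actual agency works.

```python
import math
import random

Z_95 = 1.96  # critical value for a two-sided 95% interval (normal approximation)

def margin_of_error(share, n, z=Z_95):
    """Approximate margin of error for a proportion `share` from n respondents."""
    return z * math.sqrt(share * (1 - share) / n)

def sample_size_for(share, moe, z=Z_95):
    """Smallest sample size whose margin of error does not exceed `moe`."""
    return math.ceil(z ** 2 * share * (1 - share) / moe ** 2)

def fraction_within(true_share, n, moe, polls=1000, seed=0):
    """Simulate `polls` independent polls of n people each and return the
    fraction whose result lands within `moe` of the assumed true share."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(polls):
        favourable = sum(rng.random() < true_share for _ in range(n))
        if abs(favourable / n - true_share) <= moe:
            hits += 1
    return hits / polls

# Assumption for illustration: true support is 51%, target margin is 2%.
n_needed = sample_size_for(0.51, 0.02)
print("respondents needed for a 2% margin:", n_needed)
print("margin with only 1000 respondents:", round(margin_of_error(0.51, 1000), 3))
print("share of 1000 simulated polls within 49%-53%:",
      fraction_within(0.51, n_needed, 0.02))
```

Under these assumptions a 2% margin needs roughly 2,400 respondents, while a poll of 1,000 carries a margin of about 3%. Because the required sample grows with the inverse square of the margin, halving the margin means roughly quadrupling the sample, which is where the cost tradeoff comes from.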
There are a few well-defined ways in which sampling for political polls can give erroneous predictions. Of these, the two discussed below are the most potent sources of error. In fact, the second was so powerful that in 1948 all the major polling agencies indicated a landslide victory for Thomas Dewey, yet Harry Truman won the presidency of the U.S. A bias similar to the first helps explain the surprise victory of the UPA-I in the Lok Sabha elections of 2004, where most polls confidently predicted an NDA win.
The non-response bias: This is the most immediate error. Suppose the polling agency in question has come up with a sample with minimal selection bias. Suppose further that, in a sample of 1000 (say), 30 people refuse to cooperate or respond to the questionnaire. This would mean replacing 3% of the sample, and it would inevitably introduce some selection bias if the people who do not answer hold different opinions from those who do. And this, quite obviously, isn’t something that can be remedied by increasing the sample size, because the proportion of people who do not respond does not fall as the sample grows. For instance, Rani works in a call centre in Calcutta. Out of the fifty calls she makes daily, ten people ignore her. If she decides to work overtime and makes eighty calls, ceteris paribus, about sixteen people will ignore her: the absolute numbers rise, but the proportion of people ignoring her does not decrease.
The response bias: The counterpart of the bias just discussed is the response bias. In this case, respondents do not accurately voice their opinions, or casually give answers that are not at all in accord with their actual political choices. The same distortion may arise when polling agencies publish results unscrupulously or are manipulated by the political parties in question. The response bias can be curbed by increasing the sample size only to the extent that the misreporting is random rather than systematic, and if the polls are publicized, results tend to be inaccurate. For example, suppose a woman stands as a prime ministerial candidate and I, despite being a staunch patriarch, want her to win, but owing to societal pressure I tell the pollster otherwise.
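To put numbers on the non-response bias described above, here is a minimal sketch. The response rates are invented purely for illustration (supporters answering 90% of the time, opponents 97%), and the function names are hypothetical. The point is only that the skew depends on who answers, not on how many people are approached.

```python
def observed_share(true_share, respond_if_for, respond_if_against):
    """Share of support a poll would report, in expectation, when supporters
    and opponents respond at different rates. Sample size cancels out."""
    answered_for = true_share * respond_if_for
    answered_against = (1 - true_share) * respond_if_against
    return answered_for / (answered_for + answered_against)

def expected_refusals(calls, refusal_rate):
    """Expected number of people who ignore the caller at a fixed refusal rate."""
    return calls * refusal_rate

# Assumed rates, for illustration only: true support is 51%,
# supporters respond 90% of the time and opponents 97% of the time.
print(round(observed_share(0.51, 0.90, 0.97), 3))

# With equal response rates there is no skew, whatever the rate is.
print(round(observed_share(0.51, 0.80, 0.80), 3))

# Rani's calls: ten refusals in fifty is a 20% rate, which stays at 20% for eighty calls.
rate = 10 / 50
print(expected_refusals(50, rate), expected_refusals(80, rate))
```

With these assumed rates a candidate who truly leads with 51% would appear to trail in the poll, and polling more people would reproduce the same skewed figure with more confidence.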
Apart from these two, there are other oft-quoted biases, such as the bias in coverage and the bias due to the wording and ordering of questions in a survey. In a country like India, most polls are concentrated in urban areas and capture only a part of the electorate. The rural population is infrequently surveyed owing to difficulties of communication and of access to information. Response bias seems to be at its highest among the rural population, even where the coverage bias is overcome. Nevertheless, with the advent of mobile phone technology, the polling exercise has become more convenient.
Bibliography and citations:
Crespi, Irving. Public Opinion, Polls and Democracy (1989)
Traugott, Michael W. The Voter’s Guide to Election Polls 3rd ed. (2004)
Glynn, Carroll J., Susan Herbst, Garrett J. O’Keefe and Robert Y. Shapiro. Public Opinion (1999)