Whoever taught reporters the phrase “statistically significant” without at the same time teaching them the meaning of that phrase did no one any favors. In an otherwise interesting article about Republican support for Barack Obama, Patrick Healy of the New York Times writes:
Based on recent polls, as well as interviews with Obama advisers, Republican voters are not moving to Mr. Obama at a greater pace than they moved to Senator John Kerry, the Democratic nominee in 2004. In the most recent New York Times/CBS News poll, conducted this month, 9 percent of Republicans said they would vote for Mr. Obama if the election were held today; at the same point in 2004, 6 percent said they would have supported Mr. Kerry, a statistically insignificant difference.
I know this is a hard concept for some reporters to understand, but it turns out that nine is larger than six: to be precise, it’s 50% larger. If the poll shows 9% Republican support for Obama vs. 6% for Kerry, it shows that Obama has 50% more Republican support than Kerry had. There’s an error band around that estimate, larger or smaller as the sample size of the poll is smaller or larger; if that error band for 95% confidence includes zero, the difference is not statistically significant at the .05 level; that is, there’s more than one chance in twenty that sampling error alone could have accounted for the difference, even if the underlying true proportions were identical. In that case, the right thing to say is “Based on measurements done so far, it’s not possible to say with confidence that Republican voters are moving toward Mr. Obama at a faster pace than they moved toward John Kerry.” But for whatever the measurement is worth, it’s evidence that Obama is doing better than Kerry, not that he is failing to do better than Kerry.
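The arithmetic here is the standard pooled two-proportion z-test. As a sketch, and assuming a hypothetical subsample of roughly 300 Republicans in each poll (the article doesn't report the actual subsample sizes), the measured 9%-vs-6% gap comes out short of the 1.96 cutoff for 95% confidence:

```python
import math

def two_prop_z(p1, n1, p2, n2):
    """Pooled two-proportion z statistic for H0: the true proportions are equal."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Hypothetical subsample sizes: ~300 Republicans per poll.
z = two_prop_z(0.09, 300, 0.06, 300)
print(round(z, 2), "significant at .05" if abs(z) > 1.96 else "not significant at .05")
```

With those assumed sizes the z statistic is about 1.39, which is positive (the evidence points Obama's way) but below 1.96, so the difference is "not statistically significant" in exactly the sense described above.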
My handy-dandy on-line binomial significance calculator tells me that you’d need samples including about 600 Republicans today and 600 from 2004 to make the measured difference significant at the 95% level. Since Republicans are increasingly rare, that would require a total sample size near 2000 to do the trick, which is more than most newspaper polls want to pay for. But it’s not, for example, more than you could get from three days of Gallup daily tracking. However, since newspapers seem to have a taboo against using polling data that doesn’t carry their brand name, the reporter never called Gallup and asked for those cross-tabs. Or, more likely, the reporter treated “not statistically significant” as some magical quality of the data, rather than as a common-sense problem to be worked around.
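You can reproduce the calculator's answer directly: under the same pooled two-proportion test, just search for the smallest equal group size at which a 9%-vs-6% gap clears the 1.96 threshold. This is a sketch, assuming equal Republican subsamples in both polls:

```python
import math

def z_stat(p1, p2, n):
    """Pooled two-proportion z statistic, assuming equal group sizes n."""
    pooled = (p1 + p2) / 2
    se = math.sqrt(pooled * (1 - pooled) * 2 / n)
    return (p1 - p2) / se

# Smallest per-group n at which 9% vs. 6% is significant at the .05 level.
n = 1
while z_stat(0.09, 0.06, n) <= 1.96:
    n += 1
print(n)  # a bit under 600 Republicans in each sample
```

The search lands just under 600 per group, matching the back-of-the-envelope figure above.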
Really and truly, folks, this is not what a friend of mine calls “rocket surgery.” They’re teaching this stuff in high schools now. (And a good thing, say I.) There’s no excuse for newspapers to be spreading innumeracy.