If A Scientific Finding Was Retracted, They Know it Must Be True

Christopher Wanjek lists the five biggest retractions of science in 2011. Some were honest errors, others were likely fraud. Here are the inaccurate findings that were later retracted:

(a) Closing medical marijuana dispensaries increases crime
(b) Butterflies once accidentally mated with worms, thereby creating caterpillars
(c) Appendicitis should be treated with antibiotics rather than surgery
(d) Litter breeds crime and discrimination
(e) Chronic fatigue syndrome is caused by a virus

The educative impact of these retractions will unfortunately be limited by two factors. First, although the mainstream media generally covers retractions, influential bloggers often do not. I would not single out any particular blogger for criticism when this is such a prevalent problem, but if you search on many websites that lavished attention on the initial appearance of the since-retracted findings you will often not find a retraction published later (I hope those bloggers just learning of these retractions are addressing them now on their sites if appropriate. There is no shame in having been taken in by the initial reports — lots of people were — but to not acknowledge that inaccurate content has gone out under your name seems a breach of bloggeristic ethics).

The other force limiting the influence of these retractions is that false finding (a) and to some extent (c) and (e) have become politicized. I searched on a few sites outside the MSM for retractions of the marijuana dispensaries finding and the first two I found illustrate the problem (I was sufficiently discouraged at that point to stop searching, but please, someone — anyone — post a list of advocacy groups/commentators who forthrightly acknowledged that the initial finding was retracted due to a serious scientific error…I am always ready to have my faith in human nature restored).

Tim Cavanaugh of Reason Magazine covered the retraction mainly by attacking the people who were right to be skeptical of the initial marijuana dispensaries report while he was touting its results. Kris Hermes of Americans for Safe Access claimed that ASA had already done studies showing that the finding was correct (presumably misplaced until this moment) and went on to speculate that the retraction of the study was politically motivated. Similar reactions were the norm in many quarters after 2010’s biggest scientific retraction: the fraudulent linking of MMR vaccines to autism by Dr. Andrew Wakefield.

In those circles where putative findings are embraced not for truth value but for emotional impact and political utility, a retraction is the ultimate confirmation that a study’s results are true. After all, “they” (there is always a “they”) couldn’t deal with the truth, so they had it suppressed. The surgeons’ guild had the guy who promoted antibiotics discredited, the pharmaceutical industry smeared the people who proved that CFS is caused by a virus, and the vicious drug warriors threatened the marijuana researchers into withdrawing their dispensaries and crime study results.

In psychologist Leon Festinger’s famous Doomsday Cult participant observation study, the research team wondered what would happen to the cult members’ faith when the world did not in fact end on the predicted day. After initial moments of shock, the cult members concluded it was their faith itself that had spared the Earth from destruction, which only intensified their commitment to the cult.

And so, alas, it goes.

Comments

  1. says

    That is a cheap hit against Cavanaugh. Most of his post is just bashing the anti-dispensary people who were making a big deal out of the retraction, not diving into some conspiratorial cultish lunacy. He does say the finding made “intuitive sense,” but for heaven’s sake, the title of the post is an unequivocal admission that the study was retracted. Anyway, the study wasn’t fraudulent, like the Wakefield fiasco, it was missing data. Continuing to say the finding made “intuitive sense” until RAND comes out with their repaired version is basically reasonable.

    I know you guys get a kick out of whacking the drug policy reform people DLC-style, but sheesh.

    • J. Michael Neal says

      As below, Keith’s statement was correct. Most of the post is about attacking those he disagrees with. Cavanaugh never says that there was political pressure to retract the study, but his mention of local governments doesn’t seem to have any relevance to the post unless it is taken as an insinuation that that’s exactly what happened. The second cite Keith uses does come out and make that accusation directly.

      • Keith Humphreys says

        And note that saying the study was retracted is a half-truth. The whole truth is that the study was retracted because it was fatally flawed, the critical data were absent from the database that was analyzed. Leaving that fact out of the title and story implies that the study was retracted for some other reason.

        Cavanaugh’s original article had the following first paragraph:

        As if every day doesn’t already bring us more reasons to thank Gen. Curtis LeMay, a new RAND Corporation study reaches the highly expected conclusion that neighborhoods suffer increases in crime when they drive away business.

        That’s way beyond noting it “made intuitive sense”, it’s uncritical cheerleading.

        To make my own animus clear, I don’t know if closing dispensaries increases or decreases crime. It’s a question that I don’t think is very interesting and is also hard to study. But I do know that someone who ballyhoos a finding that is withdrawn owes his or her readers a full and sincere retraction, and that is simply not what Cavanaugh did.

        • Kenneth Almquist says

          Cavanaugh explains the problem with the study near the top of his article:

          “Researchers at the Cold War-era think tank, however, used crime data from a site that only included statistics from the L.A. County Sheriff’s Department, not the Los Angeles Police Department.

          Opponents of the report also objected that the study did not attempt to establish how many dispensaries had actually closed down, as opposed to merely receiving a shutdown order.”

          Furthermore, he links to the retraction on the Rand web site in the first sentence of his article.

          I don’t think you’ve made a convincing case that the study was “fatally flawed.” Presumably the statistics from the Sheriff’s Department omit a lot of crimes that are only reported to the city police. One possible explanation for the data in the study would then be that when a dispensary shuts down, the amount of crime in the area stays the same or even increases, but less of it is recorded by the Sheriff’s Office. Can you suggest a plausible mechanism to explain how shutting down a dispensary would reduce the reporting rate? If not, I think you would have to concede that the most likely explanation for the reduction in reported crime is that crime actually decreased.

          Now, that could just be a statistical anomaly, but the possibility of statistical anomalies affects most research in the social sciences, so if you consider that to be a “fatal flaw” it is a “fatal flaw” that affects most social science research. Ideally we would like to see the pattern hold up not only when the analysis is re-run using crime data from the city police, but also when looking at dispensary closings in other cities.

  2. Dennis says

    Antibiotics instead of surgery for appendicitis? And it got published? That boggles the mind almost as much as the old gastric freezing study.

    Thanks for another classroom example, Keith.

    • Ed Whitney says

      Actually, the issue of conservative treatment of acute appendicitis in adults is far from settled, especially in questions pertaining to the urgency of operative intervention. The retraction notice of the article in question says that the editors retracted it because significant portions of it had been published earlier in other journals; one of these prior articles was a randomized trial comparing surgery and antibiotics in the British Journal of Surgery in 1995; this trial had reported that antibiotic treatment compared favorably to surgery, but with significant recurrence rates at one year.

      A recent systematic review in the Canadian Journal of Surgery (free access: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3195652/?tool=pubmed) examines the literature and concludes that surgery should still be seen as the gold standard for acute appendicitis. Several studies had reported comparable effectiveness of antibiotic and surgical treatment. The conclusion of the review is telling:

      “Authors do acknowledge that whereas antibiotics appear to have a potential role in the management of acute appendicitis, there is simply insufficient evidence currently to lead to an alteration in practice.”

      In other words, the issue is far from being a no-brainer whose answer is self-evident; there is some evidence supporting antibiotics, but it is not sufficient to lead to a recommendation to change practice. It got published all right; so did a sufficient number of articles in a variety of good journals that a careful review of the balance of the evidence was called for. Nothing to boggle the mind here, and if it is a classroom example I am not sure what it is a classroom example of.

      • says

        To settle the debate, you would need a bigger and rigorously planned randomised trial. Since physicians have definite opinions about the best treatment – appendicitis is hardly a rare condition – how could this be set up?
        I get the impression from the Canadian article that a lot of studies are carried out that are statistically incapable of leading to conclusive results. Given the risks to patients of enrolling in a clinical trial for anything serious, this is not acceptable.

        • ShadowFox says

          It’s not a question of physicians’ opinions–how does one randomize it and how can patients be convinced to go for an unproven method (which has not been the subject of a retracted publication)? “We have a very promising technique here that we can try instead of the customary surgery. You know that every surgery, however minor, entails risks, although the probability of fatality is quite small. This technique… er… we’re not quite sure how successful it is, but, if it fails, you can rest assured we’ll cut it out just as we would have otherwise. Now, would you like to try the experimental treatment?”

          • J. Michael Neal says

            Also, how do you set up a control group? Pretty clearly, the patients are going to know whether they had surgery or not, so there’s no double blind. This is something I wonder about: how is it possible for the effectiveness of any non-surgical treatment to be compared to that of surgery?

          • Ed Whitney says

            Blinding of trials involving interventions that cannot be blinded always leaves some uncertainty about bias in the results.

            Fitzmaurice used the Randomized Trial outline of the Critical Appraisal Skills Programme for reviewing the studies in his review. http://www.caspinternational.org/mod_product/uploads/CASP_RCT_Checkist_14.10.10.pdf (reference #11 in his review, but don’t click on it since it seems to be a dead link) gives the criteria for evaluating a clinical trial. Blinding is question #4. There are 11 questions in all to apply to any particular trial. Even though blinding of the patient cannot be expected, it is still possible to control other sources of bias. Randomization is the most important criterion, in order to compare groups which can be expected to have similar outcomes if similarly effective interventions are applied; if the outcomes are different, randomization gives you the best shot at attributing the difference in outcome to the differences in the interventions (rather than to some other factor such as age or general health status).

            Fitzmaurice did point out many good reasons to look skeptically at the studies reporting comparable effectiveness between antibiotics and surgery. The question was considered worthy of evaluation in part because of good evidence that many cases of appendicitis resolve spontaneously, and because other intra-abdominal infections like diverticulitis are routinely treated with antibiotics.

            I mentioned all of this because it seems that the Wanjek article implied that the antibiotic treatment of appendicitis was a fringe opinion appearing in a flaky journal and that the retraction of one weak paper settled an issue that everyone with half a brain already knows the answer to. This is a misleading representation of the state of things.

          • MobiusKlein says

            As I have had suspected appendicitis twice (in the age before MRI / cat scans) I can vouch that the situation can resolve spontaneously.
            So yes, there is a case to be made that you don’t have to jump right into surgery immediately – depending on evidence gathered from examinations.

            And since surgery is a risky, expensive procedure, we should cast a critical eye to when it can be avoided.

        • Ed Whitney says

          Interesting story, Mobius. Clearly your appendix did not perforate. If there were crystal balls which could tell when an appendix is likely to go on to perforation, the practice patterns might change.

          One other point about blinding. This is the best way to avoid biases that arise from patient expectations (AKA the placebo response). As a rule of thumb, the more invasive the intervention, the greater the placebo response; if you take a pill, you may have some expectations of benefit which will translate into a measured therapeutic response. If you have an injection, that response tends to be greater than with swallowing a pill. If you have an operation, your expectation of benefit tends to be even greater.
          Thus, if there were biases arising from lack of blinding, they would tend to be in favor of surgery and against antibiotics. After all, everyone knows that surgery cures appendicitis, but medication alone is likely to have much less powerful a placebo response. Since the potential bias goes against the direction in which the results of the disputed antibiotic trials were reported, lack of blinding is not the place to look for reasons to dismiss their findings. The Fitzmaurice article discusses other weaknesses of the trials which warrant its reluctance to recommend changes in practice at this time.

      • Dennis says

        And the telling fact is, “…with significant relapse rates at one year.” The consensus number in the studies examined by the CJS review is ~10% relapse rate. There appear to be biases in the study designs that lead to surgery for the more acutely ill patients. This would probably tend to understate the relapse rate if antibiotic treatment were standard. It would be interesting to know what fraction of those relapses involved perforation of the appendix.

        I think the CJS study has the right conclusion, but buries the lede. It isn’t simple absence of evidence that antibiotics alone are acceptable, but the absence is combined with evidence that a substantial fraction of the antibiotic patients will be back within a year. Do the surgery and be done with it. Better yet, use antibiotics as a bridge and do laparoscopic surgery.

  3. says

    The butterflies story sounds like material for the IgNobel. Anyway, was it supposed to be by mating or another method of gene transfer? The evolution of insect metamorphosis sounds like a real puzzle. It’s hard to see how incremental change can drive it, as Darwin convincingly suggested for the evolution of the complex eye.

    • MobiusKlein says

      I would suggest asking that question to an actual Evo-Devo specialist.

      Incremental change can do wonders in many ways – consider the various stages of mammalian changes in the womb. It’s a sort of metamorphosis, just more hidden.

  4. says

    Can you give me the source of your information that the medical marijuana dispensary and crime findings were “inaccurate”? My understanding was that the study was pulled because (assuming it wasn’t due to political pressure) of concerns about the size and parameters of the study, not because the findings were discovered to be inaccurate.

    At most, you can say that the findings were not sufficiently supported by the data, not that they were inaccurate.

    I assume that this inaccurate comment in your blog will be addressed.

    • J. Michael Neal says

      If the claims of the paper were insufficiently supported by the data, then it was inaccurate in those claims. Keith’s statement is perfectly accurate.

      • says

        So if I did a study of people on the street and found that the majority had noses, but was criticized because I didn’t conduct the study over enough days or see a sufficient number of people for proper data analysis, would that make the findings inaccurate?

        No. Still turns out that people have noses.

        • ShadowFox says

          If you did a study on the street to see if people are dead and found out, oh, well, they are not dead–because, here they are, walking down the street–but were then criticized for insufficient data, would you be making the same argument? Would you still say that the study’s conclusion was valid–if only it had more data?

          Your entire argument is absurd. The reason why we call data “insufficient” is because no meaningful conclusion can be drawn from them. Being convinced that the results are valid is no substitute for data in this case. Because, well, your intuition can just be wrong. Until there is more valid data, this case is closed. The study was incomplete and its results useless. Period.

        • MobiusKlein says

          Sample size matters. Random sample selection matters.

          If you are trying to estimate the percentage of humans with one nose, you should visit more than the failed plastic surgery convention.
          Likewise, a sample of 100 folks might lead you to believe 100.000% of humans have a nose, rather than 99.999%.

        • J. Michael Neal says

          “No. Still turns out that people have noses.”

          Indeed. Perhaps this is because a truly stupendous amount of data has already been collected on the subject, most of it relevant. If we discard *your* survey, we find that there remains sufficient evidence to state a conclusion.

          If one’s study is on something that has not been previously studied, then finding gross flaws in it does truly require retraction of the findings.

          • Anonymous says

            Guther’s comments are a disturbing illustration of the mentality described in the post, not least because he seems to think he has humbled the rest of us with his special understanding of mystic truths that only he can see.

  5. Toby says

    Physicist Dr Richard Muller, a self-styled “climate sceptic”, published evidence for global warming, gathered by a team of scientists (including a Nobel winner) under his leadership.

    http://online.wsj.com/article/SB10001424052970204422404576594872796327348.html

    The WSJ only published his op-ed in its international web edition. Presumably, the American public needed to be protected from his news.

    This is not quite a “retraction”, and Muller essentially re-hashed work already done by other scientists. Worst of all, extreme science deniers soon began to manufacture reasons to dismiss his analysis. But it is worth noting.

    http://berkeleyearth.org/available-resources/

    • ShadowFox says

      The study has been mentioned in a lot of blogs and MSM publications. The fact that Murdoch Media restricts access to it does not mean that no one knows about it.

  6. Toby says

    I should mention a retraction that did not make the list, but was listed elsewhere as one of the great science scandals of 2011.

    Professor Edward Wegman of George Mason University had a paper retracted by the journal Computational Statistics for plagiarism. Wegman’s paper was a purported refutation of the “Hockey Stick” chart. In its original version, it was a report prepared for Congressman Joe Barton of the Congress Energy Committee, and received maximum publicity when originally published.

    Response to the retraction of this politicized paper/ report was muted in the media, except in USA Today, and the retraction has led to no pause for reflection among climate science deniers, inside or outside of Congress.

    http://www.climatesciencewatch.org/2010/02/12/deep-climate-investigation-of-denialist-and-%E2%80%9Cskeptic%E2%80%9D-attack-on-hockey-stick-temperature-record/

    • says

      Well noted, Toby. Both these are IMHO far more significant for policy than Keith’s examples. The Wegman case is prima facie fraud – much of the paper was plagiarised – and is being investigated as such by his university and publisher, at snail’s pace.

      At least the science world has a mechanism for correcting published mistakes. Where are the retractions for all those op-eds supporting the invasion of Iraq? The blogosphere is a little better than the MSM: I think we are quite good here at the RBC in admitting our (rare!) demonstrated mistakes.

  7. says

    You made a common error in your coverage of the retractions in the case of M.E./CFS – you said there was no evidence it was caused by a virus. To the contrary, there is considerable evidence of viruses that may be the cause – or an effect of a damaged immune system that then causes other woes. The article in “Science” that was retracted had to do with a novel RETROvirus. True, evidence appears glum at the moment for proponents of that thesis, but no one at the time could have known that XMRV might have been the product of a marriage between two snippets of gamma retrovirus that recombined in lab mice with much-used immortal cells. So there is no fraud here. Rather, there is now a second thesis, which still needs to explain how patients ended up testing positive for antibodies (yes, I get that the antibodies were reacting to something else – but what?). How remarkable that in dealing with a severe life-long debilitating illness, affecting a million American adults and untold numbers of school-aged young people, mainly teenagers, no one has been curious about finding out just what some of us DO have antibodies to if not a gamma retrovirus.

    Also, given that it is Dr. Coffin’s thesis that the actual “contamination” occurred in the Cleveland Clinic, where they had been working for several years on a link between some types of prostate cancer and XMRV, why haven’t THOSE articles been retracted?

    Finally, the NIH-funded study headed by respected virologist Ian Lipkin to try to get to the bottom of it all isn’t finished yet – rather poor form to require retraction beforehand.

    In my own field (in the social sciences), we would only retract an article if it was actually fraudulent. We allow disagreement.

    As for your initial error, most researchers believe “CFS”, which is more of a social construct than a well-defined disease (leading to serious problems when one study measures apples and another measures golf balls), is a heterogeneous catch-all. A significant number of patients – probably the majority – report that their experiences with the illness began with a flu-like episode – often (but not always) Epstein-Barr. There has been speculation about setting off an autoimmune response, but for a subset of the patients reporting flu, there is immune dysfunction, including immune DEFECTS. These patients also show continuing evidence of chronic viral infection, with the culprits including the beta herpesviruses (HHV-6, particularly Variant A; HHV-5 (cytomegalovirus); and HHV-7). Also present in a number of patients is evidence of an ongoing infection with Coxsackie B, parvovirus, and adenoviruses.

    CDC likes to pretend that these viruses must be proved to CAUSE the disease or they do not matter. But of course they do matter, particularly if in the spinal fluid, because the subset with M.E. (Myalgic Encephalomyelitis) has, literally translated (and consistent with symptoms), generalized muscle pain, encephalitis, and serious CNS dysfunction. I should also point out that for years this condition was called atypical polio, and the British researcher who had been working with M.E. since helping coin the term published a textbook and revision (1986, 1988) in which he made it clear he thought that M.E. was of viral origin.

    So much more to be studied! But NIH claims to allocate only $6 per patient per year – 1% of the funding for M.S., hardly an underfunded disease – and a close search of where the grants actually went leaves you with $1.65 per person per year. As for CDC, the only treatments they suggest are SSRIs, sleeping pills, Cognitive Behavior Therapy, and Graded Exercise.

    So it matters greatly whether you have a large subset of patients with a reactivated or opportunistic virus. It matters greatly if they have significant immune defects, and/or a problem with autoimmunity. It matters because some of the one million sufferers could at last get treatment.

    I suggest you retract the statement suggesting there’s no relationship between CFS (or M.E. – not the same, but they overlap as diagnoses) and viruses. But then again, as a scholar I can only agree with retraction in cases of fraud. Your error was perfectly understandable – so just mention that there is considerable ongoing work on the relationship between CFS/M.E. and viruses.

    Thank you.

    • Cardinal Fang says

      “In my own field (in the social sciences), we would only retract an article if it was actually fraudulent. We allow disagreement.”

      So, Mary, can you clarify? Let’s say I publish an article in a social science journal. Suppose further that my research is in no way fraudulent. Now let’s say it’s later discovered that my data in no way supports my claims: maybe my math was wrong, maybe I had an insufficient sample size, maybe my logic was wrong, maybe the science supply company mislabeled my chemicals or unbeknownst to me I was using the wrong plants or animals, maybe I was an incompetent researcher so my data was wrong. But no fraud. In such a case, are you saying my paper should not be retracted, even though it’s out there in the literature and it’s wrong?

      Allowing disagreement is one thing, but allowing bad science is quite another. Scientists who believe my claims should not be using my flawed paper to support their beliefs.

      • Rob says

        There has been no data presented in any study that refutes the findings in Lo or Lombardi et al. the two papers that found polytropic MRVs to be infecting people with ME.

        You can be forgiven for not realising this because you did not pay enough attention and read all the studies published.

        It is bad science to retract papers to support political beliefs and vested interests.

      • says

        Do you mean that it’s just a crummy job of research? Somebody would write that it was a crummy job of research. Period. It wouldn’t be retracted. Economics. Sociology. Political Science. History. Unless it was fraudulent, it would not be retracted.

        I can point to lots of crummy articles that showed up in good journals, believe me. There are also cases where the thesis was pretty soundly refuted – but the article would not have been retracted. And there are also some famous examples where the article was considered ridiculous at first, and later became a seminal work – a new idea that the profession wasn’t ready for yet.

        So when science journals retract things, the rest of us think – “Huh! Must have been fraud!”

        Especially in this case (the CFS-retrovirus theory), with people actually clamoring to withdraw funds, it smacks too much of censorship for me.

        Mary

  8. says

    My apologies for all those typos – that’s what happens when Amtrak meets an I-Pad. The British researcher was Melvin Ramsay. Don’t know how I-Pad turned “curious” into “breeders.” For most of the typos just add an appropriate consonant. (oor should be poor).

  9. ShadowFox says

    A number of other retractions occurred last year–or their impact was felt last year. Japanese virologist Naoko Mori lost her job after a bunch of her papers had been retracted late in 2010 (http://goo.gl/K935n). It seems the results–particularly the manipulation of images–had been questioned by one of the coauthors.

    In the follow up to that particular set of retractions, Ferric Fang made a comment:

    “That’s certainly a possibility. Extraordinary claims require a higher bar before the scientific community accepts them, and I think some of this work that’s published in the glamour mag journals—Science, Nature, Cell—are in those journals because they’re sensational: things like the arsenic using bacterium for example, or the novel murine virus that was associated with chronic fatigue syndrome. These claims, because they have such enormous implications and because they’re so sensational, they’re going to be subjected to a very high level of scrutiny. If that claim was made in an obscure journal, it might take a longer time [to] attract attention.”

    Interestingly, both claims that he mentioned were the subject of retractions as well (I am not sure if he implied that or if he thought the claims to be dubious even before the retractions–likely one each, as the chronic fatigue paper had been retracted before the Fang interview and the arsenic paper suffered a number of technical corrections but is yet to be retracted). And the pure-arsenic-eating bacteria claim is a very high-level retraction–but, I suspect, it was left out by Wanjek because it’s not human-interest and is not as sexy as the butterflies.

    Another high-profile retraction was the identification of “longevity genes”. In this case, the retraction cycle was less than a year, but the flaws were pointed out almost from the get-go. There was no fraud or missing data in this case–the errors arose from the use of a faulty chip that’s been known to produce false results (http://goo.gl/rYyQJ). This is a new type of retraction–usually it’s human error that is implicated (when it’s not outright fraud).

    In other retraction news, although the Moral Minds (Cognition, 2002) had been retracted earlier, its author, Marc Hauser, resigned from Harvard in 2011, following what amounted to a vote of no-confidence from his colleagues. More retractions may yet follow.

    Also, another batch of Sylvia Bulfone-Pau’s papers has been retracted, which puts in doubt her laying blame for previous mistakes on two Russian post-docs who had been fired. In fact, that paper was written while Bulfone-Pau was working at an entirely different research center and had no connection to the alleged post-doc culprits.

    • Rob says

      But which retractions are justified and how does science progress when all studies that are distasteful to a powerful few can be pulled out of sight?