A footnote (154) in Daniel Kahneman’s great book Thinking, Fast and Slow had me puzzled.
Consider a problem of diagnosis. Your friend has tested positive for a serious disease. The disease is rare: only 1 in 600 of the cases sent for testing actually has the disease. The test is fairly accurate. Its likelihood ratio is 25:1, which means that the probability that the person who has the disease will test positive is 25 times higher than the probability of a false positive. Testing positive is frightening news, but the odds that your friend has the disease …
(Before clicking to the jump for the answer, make your own estimate.)
…has risen only from 1/600 to 25/600, and the probability is 4%.
As a standard statistically challenged reader, I could not at first believe this counter-intuitive result. Surely, I thought, the test result collapses the prior probability into 1. Schrödinger’s poor cat is either alive or dead when he opens the famous semilethal box. The probability of the friend having the disease must be 96%.
I had the chutzpah to query this footnote as a mistake. Kahneman told me politely to read up on Bayes. I asked Mark and Keith for help. They agreed with Kahneman. Finally I got it. My method was simple, and may be helpful to others who either can’t follow the math, or, if they do, can’t integrate it into their everyday thinking about probabilities. The basic insight is that of Matthew 5:45:
He sendeth rain on the just and on the unjust.
Read “error” for “rain”. If there are more of the unjust, then more of them will get wet.
It’s easier to start with another of Kahneman’s examples, the delinquent cabby (Chapter 16). A taxi has been involved in a hit-and-run accident. All the cabs belong to either the Green company (85%) or the Blue one (15%). A witness has identified the cab as Blue. He is always confident in such identifications but is wrong 20% of the time. Given these facts, what is the probability of the cab being Blue?
Most people discard the base rate and simply follow the witness’ reliability of 80%. But let’s check what happens if the scenario is repeated 100 times with a perfectly representative sample of cabs.
- 15 cabs are Blue. The witness identifies 12 (80%) correctly as Blue, 3 falsely as Green.
- 85 cabs are Green. The witness identifies 68 (80%) correctly as Green, 17 falsely as Blue.
Of the 29 Blue identifications, only 12 are correct, or 41%.
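The same head count can be written as a minimal Python sketch. The variable names are illustrative only, and the numbers are the ones given in the example (100 accidents, 15% Blue cabs, a witness who is right 80% of the time).

```python
# Natural-frequency count for the cab problem.
total = 100
blue = total * 15 // 100                  # 15 Blue cabs
green = total - blue                      # 85 Green cabs

blue_called_blue = blue * 80 // 100       # 12 correct "Blue" calls
green_called_blue = green * 20 // 100     # 17 mistaken "Blue" calls

called_blue = blue_called_blue + green_called_blue   # 29 "Blue" calls in all
share_correct = blue_called_blue / called_blue

print(f"{blue_called_blue} of {called_blue} 'Blue' calls are right: {share_correct:.0%}")
# prints: 12 of 29 'Blue' calls are right: 41%
```

Nothing here is cleverer than counting: the mistaken calls outnumber the correct ones simply because there are far more Green cabs for the witness to be wrong about.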
Check with the standard formulation of the Bayes rule:
Pr(A|B) = Pr(B|A) × Pr(A) / Pr(B), where A is the cab actually being Blue and B is its being identified as Blue.
= 0.8 × 0.15 / 0.29 = 0.41.
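The 0.29 in the denominator is the overall chance of a "Blue" identification, which comes from adding the two ways it can happen: 0.8 × 0.15 (a Blue cab seen correctly) plus 0.2 × 0.85 (a Green cab seen wrongly). A hedged sketch of the rule as a small Python function follows; the function name and parameter names are my own, not standard notation.

```python
from fractions import Fraction

def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Pr(A|B) = Pr(B|A) * Pr(A) / Pr(B), with Pr(B) built by total probability."""
    p_evidence = p_evidence_if_true * prior + p_evidence_if_false * (1 - prior)
    return p_evidence_if_true * prior / p_evidence

# Cab problem: A = the cab is Blue, B = the witness says "Blue".
p_blue = posterior(
    prior=Fraction(15, 100),                # Pr(A): share of Blue cabs
    p_evidence_if_true=Fraction(80, 100),   # Pr(B|A): witness is right
    p_evidence_if_false=Fraction(20, 100),  # Pr(B|not A): witness is wrong
)
print(p_blue, f"{float(p_blue):.2f}")       # 12/29 0.41
```

Using exact fractions shows that the formula and the head count land on the same 12 out of 29.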
Taking this on board involves repetition. So let’s check the diagnostic example, upping the representative sample to 10,000. Of these, about 17 (1 in 600) have the disease; assume the test spots them all. With a likelihood ratio of 25:1, the false positive rate is then 4%, which applied to the 9,983 non-sufferers gives another 399 positives. The chance of a positive having the disease is 17 out of all 416 positives, or 4.1%, in line with Kahneman’s 4%.
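Here is a sketch of the diagnostic example worked two ways in Python: the odds shortcut Kahneman uses, and the head count on 10,000 people. It rests on one assumption that the footnote does not state, namely that the test catches every true case, so that a 25:1 likelihood ratio means a 4% false positive rate. Note the difference between odds (sick positives against healthy positives) and probability (sick positives against all positives); at these small numbers the two are close but not identical.

```python
from fractions import Fraction

base_rate = Fraction(1, 600)      # 1 in 600 of those tested has the disease
likelihood_ratio = 25             # assumed: 100% detection, 4% false positives

# Odds form of Bayes' rule: posterior odds = prior odds * likelihood ratio.
prior_odds = base_rate / (1 - base_rate)                 # 1 to 599
posterior_odds = prior_odds * likelihood_ratio           # 25 to 599
posterior_prob = posterior_odds / (1 + posterior_odds)   # 25/624

# Head count on a representative sample of 10,000.
sample = 10_000
sick = round(sample * base_rate)                         # 17, all test positive
healthy = sample - sick                                  # 9,983
false_positives = round(healthy / likelihood_ratio)      # 399
share_sick = sick / (sick + false_positives)             # 17 out of 416

print(f"odds form: {float(posterior_prob):.1%}, counting: {share_sick:.1%}")
# prints: odds form: 4.0%, counting: 4.1%
```

The small gap between the two figures comes only from rounding 16.7 sufferers up to 17.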
As long as we stick to two stages of evidence, working it out is just about as quick as applying a half-understood formula from the High Energy Magic building. The exercise is within anybody’s abilities, and above all it’s intuitively far more convincing. It might even convince those influential Luddites Lord Justices Longmore, Toulson and Beatson of the English Court of Appeal, here, paragraph 37.
Kahneman provides abundant evidence that we generally approach problems with starting hypotheses generated automatically by our brains. He eschews ev. psy. stories, but it’s not a stretch to think that our brains are ancestrally adapted to identify potential threats and opportunities rapidly by jumping to conclusions. New-born babies have robust Kantian expectations of the continuity of objects, and are surprised when a ball turns into a duck behind a screen. So we are condemned to think in a Bayesian way or not at all, correcting preconceptions and first impressions as more data comes in. This is hard work and many people don’t bother. Education might improve if educators distinguished more systematically between inherently difficult, “unnatural” tasks – learning to read and write, statistical reasoning – and inherently easy, “natural” ones, like language, storytelling, art, music, investigation and teamwork.
One of the big applications of Bayesian logic is in drug testing. I’ll leave that to the experts here. Let’s circle back to the medical example. It’s not a remote case: consider PSA tests for prostate cancer, which I’ve been through. The base rate, the lifetime incidence of lethal prostate cancer, is about 1 in 36 in the USA. Five times as many will be diagnosed with it but will die of something else, often after a lot of expensive and pointless treatment. The PSA test often used for screening is so woolly that there isn’t even a recognized diagnostic level. (The trend over time, coupled with other symptoms like enlargement, gets doctors into a genuinely diagnostic area.) Low base rate plus unreliable test: Bayes counsels scepticism and sang-froid.
Over-treatment is a problem in all rich countries. It’s a particularly strong vice of American health care, driven by the Politician’s Syllogism, and a factor in its high costs. Better statistical understanding by doctors could cut over-treatment and costs quite a lot. Two questions:
1. Are Bayesian methods taught in medical schools? I mean really taught, not just in a soon-forgotten compulsory course in research methods, but drilled in so as to become part of the mindset of any GP considering a diagnosis?
2. Are Bayesian methods built into the AI clinical decision aids being developed?
My priors on these: no for 1; quite likely for 2, as the people doing this are likely to have serious statistical chops. [Update: anecdotal but expert confirmation of both priors from commenter Dennis below.]
To market its unwelcome idea, the trade also needs a better name than the eye-glazing “Bayesian posterior probability”. In many cases of interest, the second piece of evidence is a diagnostic test of some sort. I propose we call the probability of the test being right, given its sensitivity and the base rate, the resolution of the test. Please correct me if I’m reinventing the wheel.
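To make the proposal concrete, here is a rough Python sketch of the "resolution" as the share of positive results that are true positives. The function name follows the term proposed above; the parameter names are my own. It takes the false positive rate as a third input, on the assumption that sensitivity and base rate alone are not enough to pin the number down.

```python
def resolution(base_rate, sensitivity, false_positive_rate):
    """Share of positive test results that are true positives."""
    true_positives = sensitivity * base_rate
    false_positives = false_positive_rate * (1 - base_rate)
    return true_positives / (true_positives + false_positives)

# The same hypothetical test (catches every case, 4% false positives)
# applied to groups with different base rates.
for base_rate in (1 / 600, 1 / 100, 1 / 2):
    r = resolution(base_rate, sensitivity=1.0, false_positive_rate=0.04)
    print(f"base rate {base_rate:.4f}: resolution {r:.0%}")
```

On these assumed figures the resolution runs from about 4% at a base rate of 1 in 600 to about 96% at a base rate of 1 in 2: the test is unchanged, only the population it is applied to differs.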
Commenters have not picked me up on it, but of course my conceit of confessing my errors to the Reverend Bayes is an anachronism. As a Presbyterian minister, he would have thought private confession a Papist superstition.
His career shows the price of another anachronism, the discrimination in Oxford and Cambridge until the 1830s against Nonconformists. He never married, so the equally silly rule that dons be celibate would not have been a bar to him. He was elected to the Royal Society, but an English university appointment was impossible. A very able man, Bayes wrote precisely two scientific papers in his lifetime. Archdeacon Paley, stuck in Carlisle with his large family, was more productive (and would have risen to bishop but for his politics, which were radical for the time). Still, the life of a country clergyman, isolated from his peers, was not conducive to steady intellectual work. An audience of even the most idle and dissipated students would have been more stimulating than Bayes’ congregation of Kentish farmers and shopkeepers.
In Scotland, with its more consistent Protestantism, there was no ban on married professors in its four universities – two more than in much bigger England. Francis Hutcheson and Adam Ferguson, for instance, were married. Adam Smith never was, but was quite free to be. I wonder what part this freedom played in the Scottish Enlightenment? Certainly celibacy and religious discrimination helped keep Oxford and Cambridge dozy backwaters.