# A confession to the Reverend Bayes and a modest proposal

A footnote (154) in Daniel Kahneman’s great book Thinking, Fast and Slow had me puzzled.

Consider a problem of diagnosis. Your friend has tested positive for a serious disease. The disease is rare: only 1 in 600 of the cases sent for testing actually has the disease. The test is fairly accurate. Its likelihood ratio is 25:1, which means that the probability that the person who has the disease will test positive is 25 times higher than the probability of a false positive. Testing positive is frightening news, but the odds that your friend has the disease …

(Before clicking to the jump for the answer, make your own estimate.)

…has risen only from 1/600 to 25/600, and the probability is 4%.

As a standard statistically challenged reader, I could not at first believe this counter-intuitive result. Surely, I thought, the test result collapses the prior probability into 1. Schrödinger’s poor cat is either alive or dead when he opens the famous semilethal box. The probability of the friend having the disease must be 96%.

I had the chutzpah to query this footnote as a mistake. Kahneman told me politely to read up on Bayes. I asked Mark and Keith for help. They agreed with Kahneman. Finally I got it. My method was simple, and may be helpful to others who either can’t follow the math, or, if they do, can’t integrate it into their everyday thinking about probabilities. The basic insight is that of Matthew 5:45:

He sendeth rain on the just and on the unjust.

Read “error” for “rain”. If there are more of the unjust, then more of them will get wet.

It’s easier to start with another of Kahneman’s examples, the delinquent cabby (Chapter 16). A taxi has been involved in a hit-and-run accident. All the cabs belong to either the Green company (85%) or the Blue one (15%). A witness has identified the cab as Blue. He is always confident in such identifications but is wrong 20% of the time. Given these facts, what is the probability of the cab being Blue?

Most people discard the base rate and simply follow the witness’ reliability of 80%. But let’s check what happens if the scenario is repeated 100 times with a perfectly representative sample of cabs.

• 15 cabs are Blue. The witness identifies 12 (80%) correctly as Blue, 3 falsely as Green.
• 85 cabs are Green. The witness identifies 68 (80%) correctly as Green, 17 falsely as Blue.

Of the 29 Blue identifications, only 12 are correct, or 41%.

Check with the standard formulation of the Bayes rule:
Pr(A|B) = Pr(B|A) × Pr(A) / Pr(B), where A is actually being Blue and B is being identified as Blue.
= 0.8 × 0.15 / 0.29 = 0.41.
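
The head count and the formula check above can be reproduced in a few lines of Python. This is only a sketch: the function name `posterior_by_counting` and its parameter names are my own labels, and it assumes the witness is right 80% of the time whichever colour the cab really is.

```python
from fractions import Fraction

def posterior_by_counting(n_cabs, share_blue, witness_accuracy):
    """Count cabs as in the bullet list and return the share of
    'Blue' identifications that are really Blue.

    Assumption (not stated by the source in these terms): the witness
    is equally accurate for Blue and for Green cabs.
    """
    blue = n_cabs * share_blue
    green = n_cabs - blue
    blue_called_blue = blue * witness_accuracy           # correct "Blue" calls
    green_called_blue = green * (1 - witness_accuracy)   # false "Blue" calls
    return blue_called_blue / (blue_called_blue + green_called_blue)

p = posterior_by_counting(100, Fraction(15, 100), Fraction(80, 100))
print(p, float(p))  # 12 correct out of 29 "Blue" calls, about 0.41
```

Exact fractions are used so that the 12 and the 29 stay visible in the result rather than disappearing into a rounded decimal.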

Taking this on board involves repetition. So let’s check the diagnostic example, upping the representative sample to 10,000. Out of all these, 17 have the disease, all spotted by the test. The false positive rate is 4%, which applied to the 9,983 non-sufferers gives another 399 positives. The chance of a positive having the disease is 17 out of 416 (17 true positives plus 399 false ones), or 4.1%, in line with Kahneman’s 4%.
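
For readers who want the same calculation without rounding to whole patients, here it is in formula form. It is a sketch under the assumptions of the paragraph above: the test catches every real case and has a false positive rate of 4%. The symbols D (has the disease) and + (tests positive) are my own shorthand.

```latex
\begin{align*}
\Pr(D \mid +)
  &= \frac{\Pr(+ \mid D)\,\Pr(D)}
          {\Pr(+ \mid D)\,\Pr(D) + \Pr(+ \mid \neg D)\,\Pr(\neg D)} \\
  &= \frac{1 \times \tfrac{1}{600}}
          {1 \times \tfrac{1}{600} + 0.04 \times \tfrac{599}{600}}
   = \frac{1}{1 + 23.96}
   \approx 0.040
\end{align*}
```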

As long as we stick to two stages of evidence, working it out is just about as quick as applying a half-understood formula from the High Energy Magic building.  The exercise is within anybody’s abilities, and above all it’s intuitively far more convincing. It might even convince those influential Luddites Lord Justices Longmore, Toulson and Beatson of the English Court of Appeal, here, paragraph 37.

************************************

Kahneman provides abundant evidence that we generally approach problems with starting hypotheses generated automatically by our brains. He eschews ev. psy. stories, but it’s not a stretch to think that our brains are ancestrally adapted to identify potential threats and opportunities rapidly by jumping to conclusions. New-born babies have robust Kantian expectations of the continuity of objects, and are surprised when a ball turns into a duck behind a screen. So we are condemned to think in a Bayesian way or not at all, correcting preconceptions and first impressions as more data comes in. This is hard work and many people don’t bother. Education might improve if educators distinguished more systematically between inherently difficult, “unnatural” tasks – learning to read and write, statistical reasoning – and inherently easy, “natural” ones, like language, storytelling, art, music, investigation and teamwork.

One of the big applications of Bayesian logic is in drug testing. I’ll leave that to the experts here. Let’s circle back to the medical example. It’s not a remote case: consider PSA tests for prostate cancer, which I’ve been through. The base rate, the lifetime incidence of lethal prostate cancer, is about 1 in 36 in the USA. Five times as many will be diagnosed with it but will die of something else, often after a lot of expensive and pointless treatment. The PSA test often used for screening is so woolly that there isn’t even a recognized diagnostic level. (The trend over time, coupled with other symptoms like enlargement, gets doctors into a genuinely diagnostic area.) Low base rate plus unreliable test: Bayes counsels scepticism and sang-froid.

Over-treatment is a problem in all rich countries. It’s a particularly strong vice of American health care, driven by the Politician’s Syllogism, and a factor in its high costs.  Better statistical understanding by doctors could cut overtreatment and costs quite a lot. Two questions:
1. Are Bayesian methods taught in medical schools? I mean really taught, not just in a soon-forgotten compulsory course in research methods, but drilled in so as to become part of the mindset of any GP considering a diagnosis?
2. Are Bayesian methods built in to the AI clinical decision aids being developed?

My priors on these: no for 1; quite likely for 2, as the people doing this are likely to have serious statistical chops. [Update: anecdotal but expert confirmation of both priors from commenter Dennis below.]

To market its unwelcome idea, the trade also needs a better name than the eye-glazing “Bayesian posterior probability”. In many cases of interest, the second piece of evidence is a diagnostic test of some sort. I propose we call the probability of the test being right, given its sensitivity and the base rate, the resolution of the test. Please correct me if I’m reinventing the wheel.

******************************

Update

Commenters have not picked me up on it, but of course my conceit of confessing my errors to the Reverend Bayes is an anachronism. As a Presbyterian minister, he would have thought private confession a Papist superstition.

His career shows the price of another anachronism, the discrimination in Oxford and Cambridge until the 1830s against Nonconformists. He never married, so the equally silly rule that dons be celibate would not have been a bar to him. He was elected to the Royal Society, but an English university appointment was impossible. A very able man, Bayes wrote precisely two scientific papers in his lifetime. Archdeacon Paley, stuck in Carlisle with his large family, was more productive (and would have risen to bishop but for his politics, which were radical for the time). Still, the life of a country clergyman, isolated from his peers, was not conducive to steady intellectual work. An audience of even the most idle and dissipated students would have been more stimulating than Bayes’ congregation of Kentish farmers and shopkeepers.

In Scotland, with its more consistent Protestantism, there was no ban on married professors in its three universities – one more than in much bigger England. Francis Hutcheson and Adam Ferguson were married, for instance. Adam Smith never was, but was quite free to. I wonder what part this freedom played in the Scottish Enlightenment? Certainly celibacy and religious discrimination helped keep Oxford and Cambridge dozy backwaters.

1. Ralph says

you wrote: quite likely for 2, as the people doing this are likely to have serious statistical chops.

Well, we HOPE they have serious stat chops. But actually stats ignorance goes really deep. I don’t think even non-Bayesian approaches are taught in medical school. I think medical stats instruction is along the lines of: does the test result fall within the supplied range? Yes? Then nothing to see here, move along. No? Let the treatment begin!

I will note that basic stats is beginning to appear in elementary and middle school education. It was completely absent from mine.

2. Ed Whitney says

The simplest formula is:
pretest odds × likelihood ratio = posttest odds.

The pretest odds in this case are 1:599. The likelihood ratio is 25. The posttest odds are 25:599. The probability is 25/624 which is 4%. This is close to 4.17% which is 25/600. When the odds are long, the odds are a good approximation of the probabilities.

It is difficult for many people to think in terms of odds. I ask my bookie for help with these analyses all the time.

The likelihood ratio is the likelihood of a positive result in a person with the disease (true positive rate, AKA sensitivity) divided by the likelihood of a positive result in a person without the disease (false positive rate, AKA 1-specificity). The witness of the cab accident has a sensitivity of 80% and a specificity of 80%; that is, when the cab is blue, he identifies it as blue 80% of the time (sensitivity), and when it is not blue, he identifies it as not-blue 80% of the time (specificity). The likelihood ratio for his identification is 0.8/(1-0.8) which is 0.8/0.2 which is 4.

The pretest odds of the cab being blue are 15:85. Multiply the pretest odds by the LR of 4 and you get odds of 60:85. This means the probability is 60/145 which is 41.4%.

Gotta go. My bookie is telling me that Lemon Drop in the third race is a sure thing.
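
The odds arithmetic in this comment can be sketched in Python as follows. The function name `posttest_probability` and its argument names are made up for illustration; the only rule used is the one quoted above, pretest odds × likelihood ratio = posttest odds, followed by converting odds of a:b into the probability a/(a+b).

```python
from fractions import Fraction

def posttest_probability(odds_for, odds_against, likelihood_ratio):
    """Multiply the 'for' side of the pretest odds by the likelihood
    ratio, then convert posttest odds a:b into a / (a + b)."""
    post_for = odds_for * likelihood_ratio
    return Fraction(post_for, post_for + odds_against)

disease = posttest_probability(1, 599, 25)  # pretest odds 1:599, LR 25
cab = posttest_probability(15, 85, 4)       # pretest odds 15:85, LR 4
print(disease, float(disease))  # 25/624, about 4%
print(cab, float(cab))          # 60/145 reduces to 12/29, about 41.4%
```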

• James Wimberley says

Thanks for explaining the likelihood ratio.
Your method is supposed to be easier than working it out from first principles? I can see that if you’ve got 50 to work out on a spreadsheet – say for 50 horses – a formula is the way to go.

3. Philip says

And this is the guy spouting off with little humility on how certain he is that the age of solar is here today (as opposed to, say, a decade or two away).

• James Wimberley says

Are the data leading you to revise your priors here? My prior is that past rapid growth will continue. This has been a good hypothesis for 50 years and IMHO still is. It’s a highly conservative position.

• Philip says

From your own article, a couple of things I found quickly:

“Chinese manufacturers are still suffering with global oversupply, but the government is looking to create domestic demand to help soak up some of that excess capacity.”
“China plans to increase installations fivefold by 2015. Japan has implemented solar incentives that ‘kicked off a development frenzy’”

No one’s arguing that subsidies won’t drive growth.

The question is when solar costs will be competitive with fossil fuels. They are far from it now (hence the dependence on subsidies basically everywhere) in spite of you touting otherwise.

You lack the analytic skills to participate in this argument, but unfortunately your partisanship and emotional needs blind you to that.

I’m a solar supporter, I work in the industry, but I find poorly informed and illogical supporters like you to be a detriment to the cause.

Take a lesson from your text above – learn some humility about what you really know, and what strengths and weaknesses you have.

• James Wimberley says

“The question is when solar costs will be competitive with fossil fuels.” According to Deutsche Bank researchers, that is now for Los Angeles, Hawaii, Chile, Japan, South Korea, Australia, South Africa, Israel, Italy, Spain, and Greece, and 10 to 20 other markets within the next 3 years (including China, with a credible 35 GW target for 2017), making up three-quarters of the world market. Most of China, Japan, India, Brazil, Mexico, France, Britain, Chile, a good number of US states, and of course Germany among the jurisdictions hovering just over grid parity are likely to maintain their generally modest and tapering incentives over the same horizon. Accordingly there’s no reason to think the installation volumes needed to get down the learning curves to local grid parity will not be forthcoming. You should also remember that US electricity prices are anomalously low. Most of the world’s consumers pay closer to 20 US cents per kWh than 10.

You may also be (independently) right about my emotional needs. If solar doesn’t continue its boom, we are all screwed. I cheer on to help my righteous team win. It’s curious how many in the industry like you feel in their bones it’s bound to stop, betting regression to the multi-industry mean contrary to the industry’s own 50-year experience. They may be right about their own firms. This is Schumpeterian, Darwinian capitalism, red in tooth and claw, and most of the companies in solar today will fail or be taken over. Where are Stutz, Duesenberg, Pierce-Arrow, Peerless, Cunningham, Doble, Franklin and Marmon today? These early American carmakers had all failed by 1940, and others survived only as brands of General Motors.

• RichardC says

“when solar costs will be competitive with fossil fuels” depends a lot on what you count in the costs – fossil fuels have a lot of externalities, such as immediate environmental effects of mining and oil spills; health effects of coal in particular; probable climate change; and trillions spent on military actions in the Gulf. Anyhow, I run the numbers myself pretty often, and paying about $0.16/kWh it seems that solar systems with capital cost under $2.00/W are high permit and installation costs. But those are eminently fixable: if we had the same policy here in MA for installing solar that we have for installing attic insulation, i.e. no complex permit process, plenty of competent installers, government grants cover large percentage of installation cost, then just about everyone with a suitable location would do it.

4. Katja says

My problem with these examples is that they rely too much on verbal sleight of hand. If you look at the original text of the car example, you will notice that it is actually ambiguous (it’s even more ambiguous in the form it has been reproduced here). When you have such ambiguities, it’s difficult to argue that the error is always due to an incorrect assessment of the probabilities rather than the text being misleading.

Base rate fallacies do exist and can even cause a lot of harm (e.g.: racial profiling), but I just don’t find these examples terribly convincing and am not surprised that people who aren’t familiar with mathematical parlor tricks get tripped up by them regardless of whether there’s actually a deeper misunderstanding.

• James Wimberley says

Where is it ambiguous? To save space, I left out Kahneman’s testing of the witness’s reliability by the court, leaving it as a bald assertion. I also telescoped the witness’ absence of doubt. Ed Whitney managed to read my description right.

This is emphatically not a minor parlour trick. People lose their jobs and go to jail because of wrongly evaluated drug tests. One of the key cases in Britain when the judge denounced Bayesian statistics was a murder trial.

Kahneman’s key point here is that most people ignore the base rate if they are confronted with a good causal story. It’s a pervasive cognitive bias, not a rare one driven by prejudice or an artefact of stereotyping, a distinct mechanism.

• Katja says

James, you wrote: A witness has identified the cab as Blue. He is always confident in such identifications but is wrong 20% of the time.

“Such identifications” is ambiguous, nor is it clear what set of events 20% refers to. Your wording is also consistent with the following interpretation: If you provide the witness with a representative sample of cabs (about 85% of them green, about 15% as blue), then the identification of the car as blue will be wrong 20% of the time. In which case, 80% would actually be the correct answer.

Kahneman’s phrasing is a lot more careful, but can arguably be also interpreted in such a way, and is almost certain to be misread by some people who don’t have experience with parsing mathematical questions. When you’re working backwards from an explanation, you may not see it, but phrasing mathematical questions such that they are unambiguous is actually fairly difficult.

A famous example is the following question: “Next Wednesday’s meeting had been moved forward two days. What day is the meeting now that it has been rescheduled?” Eventually you will realize that both Monday and Friday can be valid answers (because forward can mean two different things in this context). But most people don’t realize the ambiguity; quite a few people will still struggle with figuring out how other people can read it differently even once it has been pointed out to them that there are others who do interpret it differently. And the math in this case is dirt simple (addition or subtraction of the number two).

Note that I’m not saying that people don’t get conditional probability wrong. They do. But writing questions in such a way creates two types of people who get the answer wrong: Those who struggle with conditional probabilities and those who misunderstand the example. By getting the two types mixed up, rather than designing a question that singles out the first type, your conclusion becomes much weaker. This is why I’m calling it a parlor trick: it’s written to hide the critical information (that you also have to consider the rather large number of green cars misidentified as being blue). This makes it difficult to identify the original source of the error.

I realized that because I had originally read it the way I described above and it confused me: I saw the typical setup for a conditional probability question, and then, the way I was reading it, the conditional probability problem disappeared; because I knew that it was an example meant to illustrate something rather than a real world problem, I realized that I must have been misreading something, and then identified the problem.

• Ken Rhodes says

I have thought “in Bayesian arithmetic” for fifty years with no difficulty. I understand alpha errors (false positives), beta errors (false negatives), and “priors” (a priori assumptions about population attributes). To me it’s all as plain as the [rather prominent] nose on my face.

Yet I have never previously encountered the term “likelihood ratio.” It seems to me to be an oversimplification that would tend to lead casual readers down wrong paths. Is that terminology common in accurate technical writing? Is it a nice mathematical substitute for “alpha and beta error rates” in this type of thinking?

• Katja says

It will probably amuse you, but I don’t know (other than that likelihood and likelihood ratio are indeed terms of art [1]).

The thing is, I know relatively little about applied probability theory and statistics precisely because my background is fairly math-heavy.

College math (well, at least the college math I did) tends to be heavy on derivation from first principles and formal rigor, with applicability often taking a second seat. To some extent, that even held for the type of math computer science students did; that was more practice-oriented, but even fairly application-oriented stuff like discrete math (graph theory, number theory, and combinatorics) was littered with theorems and their proofs (our professor was of the opinion that only engineers would limit their study of the Chinese Remainder Theorem to the integers; as far as he was concerned, the generalization for Euclidean domains was the minimal interesting case).

Thus, the probability theory courses you take when you do mathematics in college don’t necessarily teach you how to apply probability theory to real world problems. We did probability theory starting with axioms and then building theorems on top of them as much as we actually used the methods we derived. (Math majors will probably get around to it eventually, but I’m merely a computer scientist with a heavy mathematical slant.)

I recall that Bayes’ Theorem was initially little more than a footnote when we did it. The reason is that it is a fairly direct consequence of the definition of conditional probability, but at that point in the typical probability theory curriculum doesn’t lead to a whole lot of other interesting results that you can afford to spend a whole lot of time on. Bayesian inference has important applications in the experimental sciences (and in some areas of computer science, e.g. Bayesian networks).

So, while I know Bayes’ Theorem and why it is important, I’ve practically never had to use it (or related techniques) in my day-to-day work. When I had to deal with conditional probabilities, Kolmogorov’s definition was pretty much all I needed (incidentally, the car example also arguably becomes easier and more intuitively understandable if you ignore Bayes’ theorem entirely and use Kolmogorov’s definition instead).

And while there are plenty of applications of probability theory and statistics in computer science, they have only so much overlap with the applications of probability theory and statistics in the natural sciences. Truth and falsehood tend to be rather black and white in mathematics and computer science; nobody can assign a realistic probability to the hypothesis that Goldbach’s conjecture is true, as opposed to the effectiveness of a drug test. Conjectures generally go from unproven to either proven or refuted (Imre Lakatos notwithstanding). This is less pronounced in computer science, but you can do a whole lot of good computer science research without ever having to touch even elementary statistics.

An interesting side effect is that when we computer scientists do have to deal with the experimental side of the scientific method (e.g., because we have to deal with the physical reality of actual computers), the results can occasionally be underwhelming.

[1] And no, I don’t know how many laypeople can even fully understand the diagnosis example, either.

5. Ken Doran says

The medical person who has to explain this to a patient not sure to have “statistical chops” might try the following: “Out of 1,000 people usually about two or so have Disease X. We don’t have a perfect test, but we have a preliminary test that clears about 960 out of a thousand right away. This time you were one of the 40 who were not cleared by that first test, but the likelihood is still high that you are not one of the two with Disease X. To be on the safe side, we will want to do more tests to confirm that you are one of the 38 or so who are still in the clear.”

• Keith Humphreys says

@Ken Doran: Well said indeed.

• Ed Whitney says

There are major differences in various ways of presenting data to patients, especially when matters of prevention are being discussed. Usually, when persuading people to do things like, say, take a statin drug for preventing heart attacks, clinicians emphasize relative risk reduction, not absolute risk reduction.

Suppose that your probability of a heart attack in the next 10 years is 4%, and taking a daily statin reduces that to 2%. This means that you cut your risk by 1/2, a prudent thing to do. But your absolute risk reduction is only 2%. Take 100 similar patients who do not take the drug; 4 of them will have a heart attack in the next 10 years. Now consider 100 patients who do take the drug; only 2 of them will have a heart attack in spite of taking it.

This is a number needed to treat of 50 to prevent one heart attack. If you tell the patient that 49 out of 50 are unaffected by taking the drug for 10 years, that is as accurate as telling him or her that the risk is cut by half.

It is, however, much less likely to motivate compulsive adherence with the treatment regimen.

Both presentations of the data are intuitively easy, but the implications are very different. Saying, “There is a 98% chance that this drug will make no difference in what happens to you in the next ten years” constitutes an underwhelming sales pitch for its use.
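
The three ways of presenting the statin numbers come from the same two inputs, as this small Python sketch shows. The function name `risk_summary` is hypothetical, and the 4% and 2% figures are the illustrative ones from the comment, not clinical data.

```python
from fractions import Fraction

def risk_summary(risk_untreated, risk_treated):
    """Return (relative risk reduction, absolute risk reduction,
    number needed to treat) for two event probabilities."""
    absolute = risk_untreated - risk_treated
    relative = absolute / risk_untreated
    number_needed_to_treat = 1 / absolute
    return relative, absolute, number_needed_to_treat

rrr, arr, nnt = risk_summary(Fraction(4, 100), Fraction(2, 100))
print(rrr, arr, nnt)  # 1/2 1/50 50
```

The halved risk and the 1-in-50 benefit are the same fact viewed from two sides; which one is quoted is a choice about persuasion, not about arithmetic.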

• James Wimberley says

Excellent. However, “doing more tests” will only increase certainty if the second test is independent, or it’s the same test with error created by the lab process rather than by the patient’s physiology.

6. Dennis says

Bona fides: I teach in an Applied Stat Department. I don’t teach at a medical school. I have a cousin who does, and we’ve talked extensively. I have worked with Veterinary Med Schools in a previous position. I also worked in a State Health Department for several years, with physicians and epidemiologists.

Based on my exposure, the answer to your question (1) is, No. Med students are given some brief exposure to statistics in research methods. They core this up (including Bayes’ Theorem) and promptly forget it, aside from some Bayes’-inspired axioms like, “When you hear hoofbeats, think horses. Don’t think zebras!” They tend to forget, however, that if you are in Africa you should think zebras, because they are more common than horses. With regard to (2), the clinical assistance AIs I’m aware of are all based on Leo Breiman’s CART ideas. These themselves are based in large part on Bayes’ analyses, presenting diagnostic proposals in terms of posterior probabilities. So the answer to your question (2) is, Yes.

• James Wimberley says

Thank you. Data! Plus, it’s nice to have one’s priors confirmed – and painful to have them challenged, as Kahneman underlines.
Dr House’s cases are all zebras, and usually endangered ones at that.

7. tsts says

Wimberley citing Kahneman: “has risen only from 1/600 to 25/600, and the probability is 4%.”

Hmm, this seems sloppy. First, 25/600 is 1/24, or a little bit more than 4%. Second, it has risen from 1/600 to 25/(599+25) = 25/624, not 25/600.

Here is why: Let us assume the probability of a healthy person testing positive is x, and the probability of a sick person testing positive is 25*x. Then for a large population of z (say z = 10^8) people, we would have 1/600*z sick people and 599/600*z healthy people. We would then have n_s=25*x*z/600 sick people testing positive, and n_h=599*x*z/600 healthy people testing positive, and the chance your friend has the disease is n_s/(n_s+n_h).

Or what am I missing? This is fairly basic stuff, but I have had some wine, so maybe the math part of my brain is misfiring.
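
Written out with the symbols used in this comment (x, z, n_s and n_h), the calculation runs as below. It is a restatement of the same algebra, and it makes visible that both x and z cancel, so neither needs to be known.

```latex
\begin{align*}
n_s &= \frac{25\,x\,z}{600}, \qquad n_h = \frac{599\,x\,z}{600}, \\
\frac{n_s}{n_s + n_h}
    &= \frac{25\,x\,z/600}{25\,x\,z/600 + 599\,x\,z/600}
     = \frac{25}{25 + 599}
     = \frac{25}{624} \approx 0.0401
\end{align*}
```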

• Jonamike says

I worked through the algebra in a little more detail than you did, and I get the same result as you, 25/624.

I think that the error is that James said that “the false positive rate is 4% [i.e., 4/100 == 1/25]”, when the correct statement, based on the wording of the original problem, is that “the false positive rate is 1/25 *times* the true positive rate”. One then glues the false positive rate and the true positive rate together in a disjoint union, and out comes 25/624.

• Ed Whitney says

Your math lobes are fine; the passage as quoted would indicate a confusion of probabilities and odds. When the probability is low, the odds do not greatly differ; when the probability is high, they diverge.

If your horse is at even money, that is odds of 1:1, or a probability of 50%. If you double the odds, you have 2:1 in favor, which is a probability of 2/3, a good bet but not a sure thing. If you double the probability, you go from 50% to 100%.

25/624 is 4.0064% and 25/600 is 4.17%. In practical terms, there is no difference as far as your risk of disease; the difference is smaller than the rounding errors involved in the estimates of prior probability and likelihood ratios. Your plans for taking care of the patient are not affected.

But there is a major difference between a probability of 2/3 and a probability of 100%.

8. J. Michael Neal says

Maybe my reading comprehension is slow today or maybe the question is worded ambiguously but I think the proper answer to the question at the top is that we don’t have enough information to assess the likelihood that the patient has the disease. Don’t we need information on what proportion of those who take the test will test positive and what the rate of false negatives is as well as the rate of false positives?

So 1 in 600 people sent to take the test actually have the disease. But what if 99% of those that are tested test negative and, for simplicity’s sake the rate of false negatives is 0%. So, of the 600 people that are tested, only 6 of them have a positive result. That’s the number that needs to be compared to the 1/26 chance of a false positive, not the entire sample.

• Ed Whitney says

Good question; the solution lies in the fact that we were initially given a likelihood ratio (LR) of 25.

The LR is defined by the true positive rate, call it x, and the false positive rate, call it y. The LR can be expressed as x/y. For the convenience of dealing with integers, let us reckon everything in terms of percents.

In this case we know that the LR is 25. This means that we are dealing with two numbers such that x/y=25. This equation does not have unique solutions in x and y. For any given value of x, you can solve for y; for any given value of y, you can solve for x. The lack of unique solutions underlies J. Michael’s intuition that there is not enough information, but in giving us the hypothesis that there are no false negatives, he has told us that the true positive rate is 100%, and has thereby given us enough information to solve for the false positive rate.

If there were a test that had a false negative rate of 0% (which is a true positive rate of 100%), and we are told that the LR is 25, then we have been told that x is 100, and taking the equation 100/y=25, we know that y is 4, and that the false positive rate is 4%. Everyone with the disease tests positive; 4% of the people without the disease test positive, and a little over 4% of the total will test positive. So in this instance, it cannot be the case that 99% of the people will test negative.

J. Michael’s intuition that some information is missing is based upon the fact that for any given LR, there are many values of x and y which will yield that LR. If the true positive rate is 100%, the false positive rate is 4%; if the true positive rate is 75%, the false positive rate is 3%; ditto for respective rates of 50% and 2%. They all yield a LR of 25.

I hope that this clarifies the issue and does not add too many additional clouds.
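
A short Python sketch may help here. It takes the three pairs of rates listed above, each with a likelihood ratio of 25, and the 1-in-600 base rate from the original problem. The function name `positive_test_summary` is invented for this illustration.

```python
from fractions import Fraction

def positive_test_summary(base_rate, true_pos_rate, false_pos_rate):
    """Return (share of all tests that come back positive,
    probability of disease given a positive test)."""
    sick_positive = base_rate * true_pos_rate
    healthy_positive = (1 - base_rate) * false_pos_rate
    all_positive = sick_positive + healthy_positive
    return all_positive, sick_positive / all_positive

base_rate = Fraction(1, 600)
results = []
# (true positive %, false positive %) pairs from the comment; each has LR 25.
for tpr_pct, fpr_pct in [(100, 4), (75, 3), (50, 2)]:
    share_positive, posterior = positive_test_summary(
        base_rate, Fraction(tpr_pct, 100), Fraction(fpr_pct, 100))
    results.append((share_positive, posterior))
    print(tpr_pct, fpr_pct, float(share_positive), posterior)
```

The share of tests that come back positive changes from pair to pair, but the probability of disease given a positive result is 25/624 every time. The overall positive rate is therefore not needed once the base rate and the likelihood ratio are known.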

• James Wimberley says

I just followed Kahneman in implicitly assuming a false negative rate of zero.
It’s interesting (to me at least) that a test with a false negative rate of 50% and a false positive rate of zero (over-the-counter pregnancy tests? breathalysers?), which intuitively is pretty poor, eliminates the prior probability entirely. If you test positive, you have it. Prosecutors should look for such tests.
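
In formula form, with p standing for the prior probability (my symbol, not Kahneman’s), a true positive rate of 50% and a false positive rate of zero give:

```latex
\[
\Pr(D \mid +)
  = \frac{0.5\,p}{0.5\,p + 0 \times (1 - p)}
  = 1 \qquad \text{for any prior } p > 0
\]
```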

• Ken Rhodes says

Good idea. Doctors, too.

In your real-world experience, can you think of a test (of anything) with zero probability of a false positive?

• J. Michael Neal says

Sorry, but as far as I can tell, you’ve spewed out a bunch of gobbledygook that never even attempts to answer the question I asked. You not only need to know the false positive (true negative) and the false negative (true positive) rates, but also the percentage of tests that are positive to begin with.

To look at some specifics:

. . . but in giving us the hypothesis that there are no false negatives, he has told us that the true positive rate is 100% . . .

No, I didn’t. The fact that there are no false negatives tells us absolutely nothing about what the true positive rate is. A false negative rate of 0% tells us that every person that tests as negative is, in fact, negative. So it tells us everything we need to know about the true negative rate, but nothing at all about the true positive rate.

Again, I am positing a situation in which a negative result from the test is definitive. So the false positive rate only applies to cases in which the result of the test was positive. But without any information as to what percentage of the tests have a positive result, that information tells us nothing about the meaning of a positive result.

• Ed Whitney says

Perhaps the difficulty lies in the definitions of true positive (TP), true negative (TN), false positive (FP) and false negative (FN). TP means that the person has the disease and tests positive. TN means that they do not have the disease and test negative. FP means that they do not have the disease and test positive anyway. FN means they have the disease and still test negative.

If FN is zero, this means that no one with the disease tests negative. If you have the disease, you test positive, period. If you have the condition, you are 100% guaranteed to have a positive test; you cannot test negative. Everyone with the disease is either a TP or a FN. They cannot be TN; only people without the disease can do that, and only if they test negative. They cannot be FP either (only people without the disease can do that, and only if they happen to test positive). For people with the disease, TP and FN are mutually exclusive and collectively exhaustive. They have to be one or the other, and this means that if you know FN, you know TP.

What is true is that knowing TP tells you absolutely nothing about FP. Knowing that everyone with the disease tests positive gives you zero information about how often healthy people test positive as well. The fact that there are no false negatives tells us exactly what the true positive rate is, but that tells us nothing about the false positive rate. It tells us nothing about the overall positive rate.

In interpreting a positive test result, you need to know how many positive tests are true and how many are false. The number of TP divided by the sum (TP +FP) gives you the piece of information you are going to be most keenly interested in if you get a report that your test was positive. This quantity, TP/(TP+FP) is known as the predictive value of a positive test result.

So yes, you do need to know the number of FP results in order to get the predictive value of the positive test. That is implied in the example when we are told that the likelihood ratio (LR) is 25. The LR=TP/FP, and you are correct that knowing FN tells you nothing about FP and nothing about LR, but FN does tell you TP. Knowing TP and LR, you can solve for FP.
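As a rough illustration of "knowing TP and LR, you can solve for FP", here is a small Python sketch. The function names are hypothetical, and the rates are taken per group (TP as a share of people with the disease, FP as a share of people without it). It also shows the odds form of Bayes' rule, in which the LR alone is enough to update the 1-in-600 prior.

```python
from fractions import Fraction


def false_positive_rate(true_positive_rate, likelihood_ratio):
    """Solve LR = TP / FP for FP."""
    return true_positive_rate / likelihood_ratio


def posterior_from_odds(prior, likelihood_ratio):
    """Update a prior probability using only the likelihood ratio."""
    prior_odds = prior / (1 - prior)              # 1/600 becomes 1 to 599
    post_odds = prior_odds * likelihood_ratio     # multiply the odds by the LR
    return post_odds / (1 + post_odds)            # convert odds back to probability


fp_rate = false_positive_rate(Fraction(1), 25)
print(fp_rate)  # 1/25, i.e. 4%

posterior = posterior_from_odds(Fraction(1, 600), 25)
print(posterior, float(posterior))  # 25/624, about 0.04
```

The odds form never asks for TP or FP separately, which is why the footnote can get away with quoting only the ratio.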

What is also true is that there are an infinite number of combinations of sensitivity and specificity which can give you a LR of 25. Sensitivity of 100% and specificity of 96% (FP=4%) will do the trick, since 100/4=25. Sensitivity of 50% and specificity of 98% (FP=2%) will also have a LR of 25, since 50/2=25. You can have a test which is really crappy at detecting disease in people who have it, and if the test is highly specific, you can still have a LR of 25. Hell, if the sensitivity were only 1%, but the specificity were 99.96% (FP of only 1/25%), you still have a LR of 25, and a test which is worthless at ruling out disease but as good as gold at ruling it in.
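The three combinations above can be tabulated in Python. This is a sketch with made-up names. The "negative likelihood ratio" (FN rate divided by TN rate) is a standard quantity brought in here for illustration, not one used in the example itself; it measures how much a negative result should lower the odds, so a value near 1 means a negative result tells you almost nothing.

```python
from fractions import Fraction


def likelihood_ratios(sensitivity, specificity):
    """Return (LR for a positive result, LR for a negative result)."""
    false_positive_rate = 1 - specificity
    lr_pos = sensitivity / false_positive_rate
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


tests = {
    "sens 100%, spec 96%": (Fraction(1), Fraction(96, 100)),
    "sens 50%, spec 98%": (Fraction(1, 2), Fraction(98, 100)),
    "sens 1%, spec 99.96%": (Fraction(1, 100), Fraction(9996, 10000)),
}

results = {name: likelihood_ratios(*rates) for name, rates in tests.items()}

for name, (lr_pos, lr_neg) in results.items():
    # LR+ is 25 in every row; LR- climbs from 0 toward 1 as sensitivity falls.
    print(f"{name}: LR+ = {lr_pos}, LR- = {float(lr_neg):.3f}")
```

All three tests move the odds by the same factor of 25 on a positive result, but only the first is any use for ruling the disease out.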

9. Bloix says

“He eschews ev. psy. stories”

Kahneman posits that we have two different systems for solving problems, which operate in different parts of the brain. He calls them, simply, System 1 and System 2. System 1 is the default, and operates almost instantaneously and without conscious thought or directed attention. System 2 monitors System 1 and actively comes into play only when System 1 can’t solve a problem or comes up with an answer that System 2 doesn’t accept. He claims that he can tell when one or the other is engaged because when System 2 is in operation, the pupils of the eye dilate.

So, for example, what’s 2 + 2 can be handled by System 1.
But what’s 17 x 23 requires System 2.

If you’re still reading this, you’ll note that you solved 2 + 2 without trying – it’s hard not to solve a problem that can be managed by System 1.
But you didn’t solve 17 x 23 and you’re not going to bother – System 2 requires directed attention and you can choose to use it or not.

Kahneman posits that System 1 and System 2 evolved because we need to do problem solving at every moment, often when we’re doing something else at the same time. So we’ve evolved simple heuristics that will do the work very quickly and without conscious attention. System 1 can recognize a problem as one we’ve solved before and can call up the memory that has the answer. It can even decide that a new problem is the same as an old problem and solve it. What’s 2 mogstors plus 2 mogstors? You can do that with System 1.

But System 2 evolved because System 1 can’t deal with new complex problems. We’ve evolved to dislike using System 2 (we’re “lazy”) because it takes a lot of energy and brain power and it distracts us from other things we need to do, but we can use it when we have to.

That’s his ev psych explanation.

• James Wimberley says

The evolutionary background is implied more than stated, and never flaunted. Most important, he never asks us to believe in a finding in psychology because it has a nice evolutionary story. The psychology is justified by repeated experiments on different continents. It’s there, so there is probably an evolutionary explanation.
To make a really strong evolutionary assertion, you would need to show that the cognitive biases are universals, not just widespread, like the coyness display. Prospect theorists would need to prospect in the Amazon and the New Guinea Highlands, bags of coloured marbles and rewards for betting games in their rucksacks. SFIK this has not yet been done.

10. harrync says

“Likelihood ratio” seems to be a very poor name for this concept, since it seems it can have little correlation with the likelihood that one actually has the condition.

11. DCA says

It is weird but true that the same problem that is obscure using probabilities becomes obvious using “x out of 10000” statements; this has been termed “natural frequencies”. See
Hoffrage, Lindsey, Hertwig and Gigerenzer (2000), “Communicating statistical information”, Science vol. 290, pp. 2261-2262, available online at

From personal experience, some doctors have learned this approach.

I think that it is better to give probabilities of false positive and negative than the likelihood ratio.
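For what it is worth, the “x out of N” restatement of the disease example can be generated mechanically. The sketch below is only illustrative: the function name is invented, the cohort of 60,000 is chosen so that the 1-in-600 prior gives whole people, and the 100% sensitivity and 4% false positive rate are one of the combinations consistent with a likelihood ratio of 25.

```python
from fractions import Fraction


def natural_frequencies(cohort, prevalence, sensitivity, false_positive_rate):
    """Turn rates into head counts for a cohort of a given size."""
    sick = round(cohort * prevalence)
    healthy = cohort - sick
    true_pos = round(sick * sensitivity)               # sick people who test positive
    false_pos = round(healthy * false_positive_rate)   # healthy people who test positive
    return sick, healthy, true_pos, false_pos


cohort = 60_000
sick, healthy, true_pos, false_pos = natural_frequencies(
    cohort, Fraction(1, 600), Fraction(1), Fraction(4, 100)
)

print(f"Of {cohort:,} people tested, {sick} have the disease; {true_pos} of them test positive.")
print(f"Of the {healthy:,} who do not have it, {false_pos:,} test positive anyway.")
print(f"So only {true_pos} of the {true_pos + false_pos:,} positive results are real.")
```

Put this way, the positives are visibly swamped by healthy people, and 100 out of 2,496 is the same roughly 4% that the footnote gives.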

12. Ed Whitney says

There is something artificial about this whole scenario, which is that it presupposes some conditions which are most unlikely to be encountered in real life.

It presupposes that the entire world of patients can be classified into a two by two table, in which everyone unequivocally has the disease or does not have it, and in which tests are either positive or negative, with nothing indeterminate or in between. Disease/no disease and test positive/test negative are mutually exclusive and collectively exhaustive. But conditions like pulmonary embolus can come in such a variety of forms and degrees of severity that they defy the neat classification into you got it or you don’t; similarly, imaging studies for pulmonary embolus come in a variety of degrees of “positivity” and cannot be broken down into positive and negative with nothing in between.

Nevertheless, it is important for medical students to be taught to nail the questions involving the two by two table. It is analogous to those questions in high school physics in which the student is told to “ignore friction.” You need to learn the simplified and idealized problem before going on to the nuanced and complicated.

You would not want engineers who deal with real world physics situations to have only a weak command of high school physics with its artificial conditions in which principles can be mastered and later given nuance as complications are added in. It is a bad situation when too many clinicians in training are not able to handle the artificial and contrived problems with ease.

• Ken Rhodes says

Bravo!

You can’t decompose complex questions into first principles if you don’t have a good command of first principles.

13. Ed Whitney says

“He sendeth rain on the just and on the unjust.”

Funny thing; there is currently a testosterone-dense dispute about Richard Dawkins on the other thread, but Dawkins concedes the case for teaching Biblical literacy by pointing out that if you do not know Matt 5:45 you cannot enjoy rhymes like:

The rain it raineth on the just,
And also on the unjust fella,
But chiefly on the just, because,
The unjust hath the just’s umbrella.

• James Wimberley says

A nice epigram. My brother, who is more left-wing than me, used to put it in his email signature. I could not see a way of fitting it into my already stretched analogy.
Dawkins certainly is a swinging Dick.