Harold’s nice post about sampling populations whose members spend varying spells in the state you observe reminds me of some other contexts in which the same principle sheds light. But first, a puzzle that you can skip if you know “the one about the guy with the two girlfriends and the subway”:
One of my students was greatly in love, simultaneously, with a girl who lived in Orinda and another who lived in the Mission in San Francisco. He was pretty sure he wanted to marry one of them, but not sure which, so he decided to roll the dice of life. Learning that inbound and outbound BART trains strictly alternated, first a westbound, then an eastbound, at the Rockridge station near his house (where trains in both directions stop on opposite sides of the same platform) during the relevant times of day, he carefully randomized his departure times from home, took his bike to BART, and got on the next train whichever way it was going, visiting the girl it took him to. After a few weeks of this, he was astonished to find he had had five times as many dates with the girl in the city as with the girl in Orinda, and was engaged to her.
Art institutions like to put out PR boasting about their attendance, and nothing wrong with that. The Museum of Fine Arts in Boston proudly reports “we welcome more than one million visitors each year….” That’s more than a fifth of the population of greater Boston, tots, geezers, hockey fans and all. What they welcome is a much smaller number of people who make more than a million visits, and if you’re not careful about weighting the sample in your visitor survey, you can greatly deceive yourself about the population you’re serving, and related things like the amount of political support your customers can provide when they vote on your bond issue. Mary Jo Bane and David Ellwood changed everyone’s thinking about poverty and welfare in 1983 when they looked at the variation in how long people spent in poverty and not just at a cross-section of welfare recipients at a given time; the latter greatly inflates our perception of how many of the people who become poor are long-time dependents. This lesson is the other side of Harold’s coin about the long-term unemployed: the fraction of a population that has a problem is not at all the same as the fraction of the problem represented by an instance of it.
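To make the visits-versus-visitors distinction concrete, here is a minimal sketch. The audience numbers are invented purely for illustration (they are not MFA data), and the function names are my own. It compares each group's share of visits with its share of people, and shows the reweighting (divide by visit frequency) that a survey taken at the door would need, assuming every visit is equally likely to be surveyed.

```python
from fractions import Fraction

# HYPOTHETICAL audience, invented for illustration: (visits per year, people).
AUDIENCE = [(1, 200_000), (4, 50_000), (20, 30_000)]


def total_visits(audience):
    return sum(visits * people for visits, people in audience)


def share_of_visits(audience):
    """What a door survey sees: each group's share of all visits."""
    total = total_visits(audience)
    return {v: Fraction(v * n, total) for v, n in audience}


def share_of_visitors(audience):
    """What the museum actually serves: each group's share of people."""
    total = sum(n for _, n in audience)
    return {v: Fraction(n, total) for v, n in audience}


def reweight_door_survey(visit_shares):
    """Recover shares of people by weighting each response by 1 / visits."""
    weighted = {v: share / v for v, share in visit_shares.items()}
    norm = sum(weighted.values())
    return {v: w / norm for v, w in weighted.items()}


if __name__ == "__main__":
    print("total visits:", total_visits(AUDIENCE))
    for v in (1, 4, 20):
        print(v, "visits/yr:",
              "share of visits", float(share_of_visits(AUDIENCE)[v]),
              "share of people", float(share_of_visitors(AUDIENCE)[v]))
```

With these made-up numbers, a million visits come from 280,000 people, and the twenty-visit regulars are three fifths of the door traffic but only about a tenth of the people.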
A related misapprehension flows from treating fractions like quantities. We’re initially appalled to learn that heart disease is the leading cause of death among women, but it’s at least worth a moment to reflect on what would be a better leading cause of death: cancers that hurt a lot and last a long time? Personally, I’m ready to sign up for a quick heart attack; all women will experience one or another cause of death. Fewer heart attacks are better ceteris paribus, but the fraction of deaths they cause is mostly irrelevant to that goal.
In the same vein, we would do well to think more carefully about energy conservation in our houses. If you are heating your house, turning off lights and replacing incandescent bulbs with CFLs mostly transfers a fixed energy input requirement (what’s lost through the walls and windows) from your electric bill to your gas or oil bill. There is usually some saving, but it’s far less than you might think. Of course, if you’re air conditioning, those lights and appliances use up about half again as much energy as they consume directly, because the air conditioner has to pump their heat back out. Implication: target lighting and appliance conservation in the south and the summer.
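A back-of-the-envelope sketch of that arithmetic follows. Every number in it is an assumption of mine, not a measurement: the prices, the 80% furnace efficiency, and an air conditioner that moves two units of heat per unit of electricity (which is what makes "half again as much" come out). It also assumes all the heat from the lights ends up as useful heat in a thermostat-controlled space.

```python
def net_saving_heating(kwh_saved, elec_price, fuel_price_per_kwh, furnace_efficiency):
    """Dollar saving in heating season, when the furnace must replace
    the heat the lights are no longer giving off."""
    electricity_saved = kwh_saved * elec_price
    extra_fuel_cost = kwh_saved / furnace_efficiency * fuel_price_per_kwh
    return electricity_saved - extra_fuel_cost


def total_kwh_cooling(kwh_lighting, ac_cop):
    """Electricity used in cooling season: the lights themselves plus
    the air conditioner's work to remove their heat."""
    return kwh_lighting * (1 + 1 / ac_cop)


if __name__ == "__main__":
    # ASSUMED illustrative values: $0.15/kWh electricity, $0.04/kWh fuel,
    # 80% efficient furnace, air conditioner COP of 2.
    print(net_saving_heating(100, 0.15, 0.04, 0.8))
    print(total_kwh_cooling(100, 2.0))
```

Under these assumptions, saving 100 kWh of lighting in winter saves about $10 rather than the $15 on the electric bill, while 100 kWh of lighting in summer costs about 150 kWh in total.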
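What happened to my student? He thought he was sampling trains, but he was sampling the intervals between trains, and as it happened, the schedule was W[estbound] 3:00, E 3:05, W 3:25, E 3:30, W 3:55 etc. Each eastbound left just five minutes after a westbound, so a randomly timed arrival at the platform almost always fell in the long gap that ended with a westbound train. [fixed 12:04, thanks Russell. We only take off a few points for a sign error, right?]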
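For anyone who wants to check, here is a small sketch using only the five departure times quoted above. It assumes the student's arrival is uniformly random between the first and last of those trains; the function names are mine. One function adds up the minutes of platform time that lead to each direction, and the other simulates random arrivals.

```python
import bisect
import random

# Departure times in minutes after 3:00, from the schedule quoted above.
SCHEDULE = [(0, "W"), (5, "E"), (25, "W"), (30, "E"), (55, "W")]


def minutes_leading_to(schedule):
    """Total minutes of waiting window that end with each direction's train."""
    totals = {"W": 0, "E": 0}
    for (prev_time, _), (time, direction) in zip(schedule, schedule[1:]):
        totals[direction] += time - prev_time
    return totals


def simulate(schedule, trips, seed=0):
    """Arrive at uniformly random times and board the next train to leave."""
    rng = random.Random(seed)
    times = [t for t, _ in schedule]
    counts = {"W": 0, "E": 0}
    for _ in range(trips):
        arrival = rng.uniform(times[0], times[-1])
        # Index of the first train departing at or after the arrival time.
        next_train = bisect.bisect_left(times, arrival)
        counts[schedule[next_train][1]] += 1
    return counts


if __name__ == "__main__":
    print(minutes_leading_to(SCHEDULE))
    print(simulate(SCHEDULE, 10_000))
```

Over that stretch, 45 minutes of platform time lead to a westbound train and only 10 to an eastbound, about 4.5 to 1, even though the trains themselves strictly alternate.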