Sampling footnotes

Harold’s nice post about sampling populations who spend varying spells in the state you observe reminds me of some other contexts in which the same principle sheds some light. But first, a puzzle that you can skip if you know “the one about the guy with the two girlfriends and the subway”:

One of my students was greatly in love, simultaneously, with a girl who lived in Orinda and another who lived in the Mission in San Francisco. He was pretty sure he wanted to marry one of them, but not sure which, so he decided to roll the dice of life. Learning that inbound and outbound BART trains strictly alternated, first a westbound, then an eastbound, at the Rockridge station near his house (where trains in both directions stop on opposite sides of the same platform) during the relevant times of day, he carefully randomized his departure times from home, took his bike to BART, and got on the next train whichever way it was going, visiting the girl it took him to. After a few weeks of this, he was astonished to find he had had five times more dates with the girl in the city and was engaged to her.

Art institutions like to put out PR boasting about their attendance, and there’s nothing wrong with that. The Museum of Fine Arts in Boston proudly reports “we welcome more than one million visitors each year….” That’s more than a fifth of the population of greater Boston, tots, geezers, hockey fans and all. What they welcome is a much smaller number of people who make more than a million visits, and if you’re not careful about weighting the sample in your visitor survey, you can greatly deceive yourself about the population you’re serving, and related things like the amount of political support your customers can provide when they vote on your bond issue. Mary Jo Bane and David Ellwood changed everyone’s thinking about poverty and welfare in 1983 when they looked at the variation in how long people spent in poverty and not just at a cross-section of welfare recipients at a given time; the latter greatly inflates our perception of how many of the people who become poor are long-time dependents. This lesson is the other side of Harold’s coin about the long-term unemployed: the fraction of a population that has a problem is not at all the same as the fraction of the problem represented by an instance of it.
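To make the visits-versus-visitors point concrete, here is a minimal sketch in Python. Every number in it is hypothetical, invented only so the totals come to a million visits; the group names and visit frequencies are assumptions, not MFA data. It compares each group's share of people with its share of visits (which is what a survey at the door samples), and shows that weighting each respondent by one over their visits per year recovers the share of people.

```python
# Hypothetical numbers, chosen only to illustrate visit-weighted sampling.
# Each group maps to (number of distinct people, visits per person per year).
groups = {
    "members": (20_000, 10),
    "locals": (200_000, 2),
    "tourists": (400_000, 1),
}

total_people = sum(n for n, _ in groups.values())
total_visits = sum(n * v for n, v in groups.values())


def share_of_people(name):
    """Fraction of distinct visitors who belong to this group."""
    n, _ = groups[name]
    return n / total_people


def share_of_visits(name):
    """Fraction of visits made by this group; a door survey samples visits."""
    n, v = groups[name]
    return n * v / total_visits


def reweighted_share(name):
    """Share of people recovered from a door survey by weighting each
    sampled visit by 1 / (that respondent's visits per year)."""
    weights = {g: (n * v) / v for g, (n, v) in groups.items()}
    return weights[name] / sum(weights.values())


if __name__ == "__main__":
    print(f"{total_people:,} people make {total_visits:,} visits")
    for g in groups:
        print(f"{g:9s} people {share_of_people(g):6.1%}   "
              f"visits {share_of_visits(g):6.1%}   "
              f"reweighted {reweighted_share(g):6.1%}")
```

In this made-up case the members are a few percent of the people but a fifth of the visits, so an unweighted door survey would overstate them several times over.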

A related misapprehension flows from treating fractions like quantities. We’re initially appalled to learn that heart disease is the leading cause of death among women, but it’s at least worth a moment to reflect on what would be a better leading cause of death: cancers that hurt a lot and last a long time? Personally, I’m ready to sign up for a quick heart attack; all women will experience one or another cause of death. Fewer heart attacks are better ceteris paribus, but the fraction of deaths they cause is mostly irrelevant to that goal.

In the same vein, we would do well to think more carefully about energy conservation in our houses. If you are heating your house, turning off lights and substituting incandescent bulbs with CFLs transfers a fixed energy input requirement (what’s lost through the walls and windows) from your electric bill to your gas or oil bill. There is usually some saving, but it’s far less than you might think. Of course, if you’re air conditioning, the effect runs the other way: those lights and appliances cost you half again as much energy as they consume directly, because the air conditioner has to remove the heat they give off. Implication: target lighting and appliance conservation in the south and the summer.
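The arithmetic can be sketched in a few lines of Python. The furnace efficiency and the air conditioner's coefficient of performance (COP) below are illustrative assumptions, not measurements; a COP of 2 is picked only because it reproduces the "half again as much" figure. The sketch also assumes that all the energy used by lights and appliances ends up as heat inside the house.

```python
def lighting_cut_effects(cut_kwh, season, furnace_efficiency=0.9, ac_cop=2.0):
    """Rough effect of cutting lighting/appliance use by cut_kwh.

    furnace_efficiency and ac_cop are assumed, illustrative values.
    Returns the electricity saved and the extra heating fuel burned, in kWh.
    """
    if season == "heating":
        # The furnace must replace the heat the lights no longer supply.
        return {
            "electricity_saved_kwh": cut_kwh,
            "extra_fuel_kwh": cut_kwh / furnace_efficiency,
        }
    if season == "cooling":
        # The air conditioner no longer has to pump that heat back outside.
        return {
            "electricity_saved_kwh": cut_kwh + cut_kwh / ac_cop,
            "extra_fuel_kwh": 0.0,
        }
    raise ValueError("season must be 'heating' or 'cooling'")


if __name__ == "__main__":
    for season in ("heating", "cooling"):
        print(season, lighting_cut_effects(100.0, season))
```

Whether the heating-season swap saves money then depends on the relative prices of electricity and fuel, which this sketch leaves out.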

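What happened to my student? He thought he was sampling trains, but he was sampling intertrain time periods, and as it happened, the schedule was W[estbound] 3:00, E 3:05, W 3:25, E 3:30, W 3:55 etc. A randomly timed arrival is far more likely to land in one of the long waits that end with a westbound train than in one of the five-minute waits that end with an eastbound one. [fixed 12:04, thanks Russell. We only take off a few points for a sign error, right?]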
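For anyone who wants to check the odds, here is a small Python sketch. It takes the departure times listed above literally over the 3:00 to 3:55 window and assumes the student's arrival is uniformly distributed over that window; the function name is made up for illustration.

```python
# Departures as listed in the post, in minutes after 3:00.
schedule = [(0, "W"), (5, "E"), (25, "W"), (30, "E"), (55, "W")]


def minutes_leading_to(departures):
    """Total minutes in the window during which the NEXT train to arrive
    is westbound ("W") or eastbound ("E")."""
    totals = {"W": 0, "E": 0}
    for (prev_time, _), (next_time, next_dir) in zip(departures, departures[1:]):
        # Anyone arriving in this gap boards the train that ends it.
        totals[next_dir] += next_time - prev_time
    return totals


totals = minutes_leading_to(schedule)
ratio = totals["W"] / totals["E"]

if __name__ == "__main__":
    print(f"minutes ending in a westbound train: {totals['W']}")
    print(f"minutes ending in an eastbound train: {totals['E']}")
    print(f"westbound to eastbound odds: {ratio} to 1")
```

The trains alternate strictly, one each way, yet the odds are heavily tilted toward the city because the waits are so unequal.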

Author: Michael O'Hare


20 thoughts on “Sampling footnotes”

  1. There’s a saying in football, “When your secondary is leading your team in tackles, you’ve got a problem.” Something about the heart disease statistics reminded me of this.

  2. OK, I appreciate what you’re saying about the “leading cause of death”, but if we have a general goal of increasing lifetimes — and yes, that sweeps a lot of questions under the rug, but it’s not a bad approximation — then attacking things in proportion to how many deaths they cause sounds like a pretty good scheme.

    1. It’s a reasonable heuristic, but this is a problem of the second-best. Before going all-in to delay heart attacks, it would be good to think what would replace them, and how soon.

      1. Actually, Michael, this is not a problem of the second best, either. It’s totally different. Sabermetrics has solved this problem correctly, and lots of scientists ought to take note of it.

        In baseball the sabermetricians long ago realized that the expectation of the single run you might score is less important than the expected total of runs, so the sacrifice is usually a losing tactic. Later, though, they also realized that the number of runs you score, or might score, is not the objective either. Winning the game is the objective, so the right question is “will this decision increase my expectation of winning?”

        We’re all going to die, so we can’t ask the binary question “what resource decision(s) will most lower the expectation of dying?” Instead, the right question here is “What decisions will have the best impact on remaining life expectancy?” Events/conditions occurring disproportionately later in life have less impact on further expected life. So for any individual the choices may be pretty clearcut, but for all of us as a group, the resource decision is more complex than most folks can even comprehend.

  3. I am not sure, but I might prefer cancer to a heart attack, because then you get to say goodbye to people you love.

    1. All the more reason to be showering them with love and affection now, while you’re healthy and they’re not in distress on your behalf. Get away from your computer and find a live person to hug now!

  4. “at least worth a moment to reflect on what would be a better leading cause of death”

    The leading cause of death is birth, accounting for 100% of the incidence.

    1. If the line went N-S passing through the Isthmus of Panama, you would end up with far fewer drownings. You would have much more hypothermia to deal with…

  5. The real cause of death is individuality, and you can blame that on sexual reproduction.

    To a first approximation, every amoeba now in existence has been continuously alive for a hundred million years or more.

  6. Michael,

    This is a bit off topic, but it’s something I’ve heard before and wondered about for some time now.

    You write:

    In the same vein, we would do well to think more carefully about energy conservation in our houses. If you are heating your house, turning off lights and substituting incandescent bulbs with CFLs transfers a fixed energy input requirement (what’s lost through the walls and windows) from your electric bill to your gas or oil bill.

    Is this true? Is there literature on this subject that is accessible via the web? My degrees are in physics, but in fields that have nothing to do with heat flow in residential buildings. Just enough knowledge of the issue to get myself in trouble. But I would have thought that the effect of incandescent bulbs on heating a house would be quite tiny. If anyone knows anything, I’d appreciate it.

    1. I don’t know for certain, but color me skeptical.

      It would imply that lightbulbs are (nearly) as efficient at generating heat from energy as oil and gas heaters are; given that even electric heaters are less efficient than oil and gas heaters, that strikes me as unlikely. It also doesn’t account for the fact that during summer, when you don’t need heating, your lightbulbs will generate wasted heat all the same.

      The situation is likely more difficult to assess once you consider the carbon footprint of either scheme. For example, in Quebec, 97% of its energy production doesn’t come from fossil fuels (the primary sources of energy in Quebec are hydroelectric dams), so electric heating is widespread and has a low carbon footprint.

      1. Katja, in reverse order:

        Your second point is definitely true. If the objective is defined as “reduce carbon footprint” then in those places in Quebec, inefficient use of electricity in the home is less relevant.

        Regarding your prior paragraph though, it is an interesting aspect of the physics that the efficiency you mention is irrelevant. For the electricity flowing through the light bulb, the total electrical energy consumed is converted to either light or heat; there is no third place for it to disappear. So any joules of energy input as electricity NOT converted to light are necessarily converted to heat. The deal about fluorescents is they convert a much higher percentage of the joules to light. Thus (a) they need fewer joules to create an equal amount of light (i.e., a lower wattage for equal brightness), and (b) they create fewer BTU of heat in the process. A double win.

        1. The light-bulb thing goes even further than that. Every photon of light produced (except for the ones that leave through the windows) is eventually absorbed and converted to heat (well, OK, also except for the ones that strike houseplants and result in photosynthesis).

          But even in the winter, incandescent lights don’t do you all that much good because the distribution of lighting-related heat production is unlikely to match the distribution of desired warmth. Unless your lighting consists of a bunch of followspots.

  7. Just a note on your MFA example: I don’t know if the MFA is carefully distinguishing visits from distinct visitors, but there is some tourism in Boston, so it’s not unreasonable that the MFA could have X distinct visitors a year, even if the number of visitors from the Boston area is much much less than X.
