#Nerdpost: Length-biased sampling

This post was inspired by a tweet. @AnnieLowrey praised a beautiful New York Times story by Pam Belluck about dementia among geriatric prison inmates. Lowrey expressed surprise that “21 percent of America’s prisoners are serving 20 years to life.” That is indeed a surprising number. It reflects a statistical and operations research principle you may not have considered, but which turns up all over the place in health care and public policy. It’s called length-biased sampling, and it’s worth some thought.

The easiest way to understand the basic issue may come from a simple example. Suppose you went over to the University of Chicago Medical Center last Thursday and surveyed every hospitalized patient. Many of the people you met would have been in the middle of a long-term stay. Why? Because people who only needed to stay one or two nights walked out before you could meet them. The stock, or cross-section, of hospitalized patients you met that day was a very different, much sicker sample than you would have found had you instead surveyed every patient who began a hospital stay on that same day.

In similar fashion, murder is a rare crime. Yet because each murderer contributes many person-days to the prison population, murderers are grossly over-represented in the cross-sectional population of prisoners incarcerated on any given day.

In various disguised forms, this same problem appears in many other contexts. Suppose your morning bus doesn’t really follow a schedule. It is equally likely to arrive in any given minute, and on average one arrives every 15 minutes. You walk outside at a random moment. You might think that you will wait, on average, only 7.5 minutes. You will actually wait, on average, the full 15 minutes, because a randomly chosen moment is more likely to fall within a long gap between buses than within a short one.
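If you would rather check the bus claim than take it on faith, here is a minimal simulation sketch. It assumes that "equally likely to arrive in any given minute" means memoryless (exponential) gaps between buses, and it compares that with a bus running on a strict 15-minute schedule. The function names `mean_wait`, `fixed_gap`, and `memoryless_gap` are made up for this illustration.

```python
import bisect
import random


def mean_wait(draw_gap, n_buses=100_000, n_riders=100_000, seed=0):
    """Average wait for riders who show up at uniformly random moments.

    draw_gap(rng) returns the time (in minutes) between consecutive buses.
    """
    rng = random.Random(seed)
    # Build the list of bus arrival times by accumulating the gaps.
    arrivals, t = [], 0.0
    for _ in range(n_buses):
        t += draw_gap(rng)
        arrivals.append(t)
    total = 0.0
    for _ in range(n_riders):
        # A rider arrives at a random moment and waits for the next bus.
        r = rng.uniform(0.0, arrivals[-1])
        next_bus = arrivals[bisect.bisect_left(arrivals, r)]
        total += next_bus - r
    return total / n_riders


def fixed_gap(rng):
    """A bus exactly every 15 minutes (the schedule we wish we had)."""
    return 15.0


def memoryless_gap(rng):
    """Exponential gaps with a 15-minute mean (assumed reading of the post)."""
    return rng.expovariate(1.0 / 15.0)


if __name__ == "__main__":
    print("scheduled bus, mean wait: ", round(mean_wait(fixed_gap), 2))
    print("memoryless bus, mean wait:", round(mean_wait(memoryless_gap), 2))
```

Under these assumptions the scheduled bus should give a mean wait close to 7.5 minutes and the memoryless bus a mean wait close to 15, even though both average one bus every 15 minutes.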

Statisticians have noted this issue for decades. When you are surveying people from an underlying population, someone’s individual characteristics may influence whether or not she will appear in your sample. So the characteristics of your study sample may systematically differ from those of the underlying population you really wish to understand. You survey lung cancer patients about their satisfaction with care. Your results may be misleading because you can only ask the question of patients well-enough to take your survey. The really sick patients and those who have died aren’t available to you.

An interesting aspect of the situation is easily missed. This isn’t just an issue for statisticians. It is often important in thinking about the substance of public policy. The people we actually intervene with on any given day (welfare recipients, prisoners and arrestees, patients, and others) are often a length-biased sample of an underlying population we have opinions about or want to understand. And the two populations may look quite different over time. The population of active criminals is different from the population behind bars, for example. And these differences matter.

I learned this subject from a fantastic paper on labor force dynamics and unemployment by Larry Summers and Kim Clark, written thirty years ago. Summers and Clark noted that the chronically jobless are grossly over-represented in the population of currently-unemployed people—a pattern sadly resonant with our situation today. If we want to increase employment, we ultimately must address the problems facing the long-term jobless.

With my colleagues Peter Reuter and Eric Sevigny, I’ve applied similar methods to some questions of criminal justice policy, in particular the potential of drug courts and other diversion efforts to reduce the prison population.

To get a little math-scary, suppose that prisoners are incarcerated at some constant rate of λ per unit time. Moreover, suppose that the distribution of time actually served (T) by any newly-arriving cohort of prisoners is well-described by some probability density function f(T). Suppose prison terms have some mean μ and some variance σ². This oversimplifies things in various ways, but the essentials turn out to be clarifying.

The population characteristics of prisoners who remain incarcerated on a given day (say March 17, 2012) will not follow the same distribution f(T). Someone convicted of armed robbery in 2002 is much more likely than a shoplifter to remain incarcerated ten years later. In fact, if g(T) is the distribution of sentence lengths among prisoners actually incarcerated today, it turns out that g(T) = T f(T)/μ. Not surprisingly, the average prison term of currently-incarcerated prisoners, say M, is larger, too. It turns out that this average is given by the relatively simple formula M = μ[1 + (σ²/μ²)].*
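For readers who want the missing step, here is a short derivation sketch. It assumes the steady-state setup described above (constant inflow, terms drawn independently from f) and writes E[T²] for the second moment of the entering distribution, a symbol the post itself does not use.

```latex
% g is a proper density because the mean of f is \mu:
\int_0^\infty g(T)\,dT \;=\; \frac{1}{\mu}\int_0^\infty T f(T)\,dT \;=\; 1

% Mean term among the currently incarcerated:
M \;=\; \int_0^\infty T\,g(T)\,dT
  \;=\; \frac{1}{\mu}\int_0^\infty T^2 f(T)\,dT
  \;=\; \frac{E[T^2]}{\mu}
  \;=\; \frac{\sigma^2 + \mu^2}{\mu}
  \;=\; \mu\left[1 + \frac{\sigma^2}{\mu^2}\right]
```

The only fact used is that E[T²] = σ² + μ². The more varied the sentences (the larger σ² is relative to μ²), the wider the gap between M and μ.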

There’s some relatively simple intuition here. The probability of observing a prisoner with any given sentence length is proportional to the original probability that a prisoner will be sentenced to that term, multiplied by the length of the sentence. If one wants to know how a given crime, say armed robbery, directly influences the current prison population, what really matters is the number of person-sentencing-years associated with that crime.
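Here is a minimal sketch of that intuition with a discrete mix of sentences. The three sentence lengths and their shares of admissions are invented for illustration; they are not estimates from real prison data. The helper names are hypothetical too.

```python
def stock_shares(inflow_shares, lengths):
    """Share of today's prison population in each category.

    Each category's weight is its share of admissions times its
    sentence length, i.e. its person-years.
    """
    person_years = [p * t for p, t in zip(inflow_shares, lengths)]
    total = sum(person_years)  # equals mu, the mean term among entrants
    return [py / total for py in person_years]


def mean_and_variance(shares, lengths):
    """Mean and variance of sentence length under the given shares."""
    mu = sum(p * t for p, t in zip(shares, lengths))
    var = sum(p * (t - mu) ** 2 for p, t in zip(shares, lengths))
    return mu, var


# Hypothetical admissions mix: 70% serve 1 year, 25% serve 5, 5% serve 25.
lengths = [1.0, 5.0, 25.0]
inflow = [0.70, 0.25, 0.05]

stock = stock_shares(inflow, lengths)
mu, var = mean_and_variance(inflow, lengths)

# Mean term among current prisoners, computed two ways.
m_direct = sum(g * t for g, t in zip(stock, lengths))
m_formula = mu * (1 + var / mu ** 2)

if __name__ == "__main__":
    print("share of admissions:", inflow)
    print("share of population:", [round(g, 3) for g in stock])
    print("mean term, entrants:", round(mu, 3))
    print("mean term, current population:", round(m_direct, 3), round(m_formula, 3))
```

In this made-up mix, the 25-year group is 5 percent of admissions but a far larger share of the people behind bars on any given day, and the direct calculation of M agrees with the formula M = μ[1 + (σ²/μ²)].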

Many people, on both sides of the political aisle, hope that we can reduce the prison population by finding better alternatives for low-level nonviolent drug offenders. This is a good idea, and we could notably reduce the inflow of people into prison if we followed these policies.

More sensible policies might indeed reduce mean sentences among all entering prisoners. Many practical obstacles impede these policies, including the limited capacity of drug courts and related interventions to reach all of the nonviolent offenders who don’t need to be in prison. It turns out that the great majority of drug-involved offenders aren’t eligible for the most touted current programs.

Unfortunately, the problems go deeper, too, and arise from the above formula for M. When we simulate the potential impact of such policies on samples of real prisoners, we find that these same policies applied to every eligible offender would still have surprisingly limited impact on the prison population. Prisoners helped by this policy would experience pretty short sentences anyway. So keeping them outside prison accomplishes less than you might think.
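A back-of-the-envelope sketch shows why. It assumes a steady state in which the prison population equals the admission rate times the mean time served (Little's law). The four categories, their shares, and their sentence lengths are invented for illustration and are not the samples of real prisoners mentioned above. The function name is hypothetical.

```python
def population_cut_from_diversion(inflow_shares, lengths, diverted):
    """Fractional drop in the steady-state prison population.

    diverted[i] is the fraction of category i kept out of prison entirely.
    With a constant admission rate, population is proportional to the
    sum of (share of admissions x time served), so the admission rate
    itself cancels out of the ratio.
    """
    before = sum(p * t for p, t in zip(inflow_shares, lengths))
    after = sum(p * (1.0 - d) * t
                for p, t, d in zip(inflow_shares, lengths, diverted))
    return 1.0 - after / before


# Hypothetical mix of admissions (shares sum to 1) and years served.
inflow = [0.30, 0.40, 0.25, 0.05]   # first category: diversion-eligible
lengths = [1.0, 2.0, 6.0, 25.0]

# Divert every eligible offender in the first category, nobody else.
cut = population_cut_from_diversion(inflow, lengths, [1.0, 0.0, 0.0, 0.0])

if __name__ == "__main__":
    print("share of admissions diverted:", inflow[0])
    print("drop in prison population:   ", round(cut, 3))
```

In this invented example, keeping 30 percent of all admissions out of prison shrinks the steady-state population by less than 10 percent, because the diverted group accounts for so few person-years.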

The current population is a length-biased sample of a highly varied population of newly-sentenced prisoners. So the parameters μ and M (the mean prison terms among all entering prisoners and among the current population, respectively) look quite different. Reducing μ just doesn’t reduce M by very much in a varied population. We found that it is surprisingly hard to reduce prison populations by more than five or ten percent if one focuses on the most obvious populations of nonviolent drug-using offenders. One really needs to do other things, including addressing excessively long sentences imposed on older prisoners, and doing a better job with individuals who are now supervised on probation or parole.

For more on these policy issues, see here.

(My next post will apply the same principles to challenges President Clinton faced in welfare reform. It includes some cool graphs produced by my masters’ degree students. If you happen to hold stereotypes about quantitatively-challenged social worker-types, you’re welcome to perform these Monte Carlo simulations yourself….)

*If you are comfortable with calculus and probability, see Karlin and Taylor’s beautifully executed textbook: A First Course in Stochastic Processes, p. 195.

Author: Harold Pollack

Harold Pollack is Helen Ross Professor of Social Service Administration at the University of Chicago. He has served on three expert committees of the National Academies of Science. His recent research appears in such journals as Addiction, Journal of the American Medical Association, and American Journal of Public Health. He writes regularly on HIV prevention, crime and drug policy, health reform, and disability policy for American Prospect, tnr.com, and other news outlets. His essay, "Lessons from an Emergency Room Nightmare" was selected for the collection The Best American Medical Writing, 2009. He recently participated, with zero critical acclaim, in the University of Chicago's annual Latke-Hamentaschen debate.

11 thoughts on “#Nerdpost: Length-biased sampling”

    1. The average offender is very different from the typical offender. The typical prisoner – if we consider all the people who ever go to prison – serves a short sentence. But the prison population at any one time depends on the average sentence, and that’s determined mostly by a relatively small number of people doing very long sentences.

      Similarly, the typical drinker (50th percentile) takes something between a drink a day and a drink a week. But the average drinker takes about two drinks a day, because the average is dominated by the small proportion of heavy drinkers.

      Strategies aimed with the typical person in mind won’t much change the average. But if we look at the average, we’ll get a very distorted view of most of the population.

  1. So the culture of prisons is dominated by the long-stay hard cases. In a way, the old cons are the tenured professors of criminality.

  2. @Modaca,

    In its essence, what Harold is saying is that definitions matter. If we think of the distribution of prison sentences it matters whether we think of this distribution as being:

    (1) Over the population of sentenced individuals; or,
    (2) Over the population of currently incarcerated persons.

    Under some simplifying assumptions that make a mathematical analysis of the issue possible, it turns out the relationship between the two distributions can be teased out. The mean of distribution (2) is the mean of (1) multiplied by 1 + the relative variance of distribution (1). What this means is that under some circumstances decreasing the mean of (1) can result in an increase in the mean of (2).

    Surveys operate on the basis of accessible populations, and so often (and incorrectly) we survey a population like (2) when we are really interested in population (1). So, for example, if we are interested in the length of unemployment periods, I can think of at least three ways to conduct such a survey.

    One would be to take a sample of persons filing for unemployment compensation and follow them until they are re-employed. A second would be to sample persons currently receiving unemployment benefits. A third would be to sample persons going off unemployment at a particular point in time.

    The distribution we would be estimating from the surveys would be different, because the populations are different. The second method would be subject to Harold’s length-biased sampling issue, the first would not be subject to length-biased sampling. But the first would have only censored results for a long period of time. All would be subject to the issue of censoring because unemployment benefits expire at some fixed period of time. And of course policy changes will modify these distributions.

  3. Thanks to Harold (and Dennis). This was a really interesting post for a math-challenged moral philosopher. After 2 readings and Dennis’ help, I’ve got it. I love the posts on Mark’s blog when social scientists explain some of the deeper concepts and findings of their fields, and then relate these to important social issues like imprisonment and possible reforms. (That’s something I’m personally interested in.) Even the best newspapers like the NY Times rarely print things that are this theoretical.

  4. Thanks. This ties with something I learned while working on drug policies. Ninety-odd per cent of some drugs are consumed by a small percentage of the using population. If you want to cut drug problems, focus on these people. In the same way, most thinking cops (a small group, admittedly) know that most serious crimes are committed by a small number of habitual criminals. Focus on these, and crime drops.

  5. Another side of this is that the short-stay (or usually-low-consumption) population can give rise to huge amounts of volatility. If some external event (why, yes, this was just the St. Patrick’s Day weekend) synchronizes the behavior of a whole bunch of median consumers/offenders, the various supply and cleanup systems have to be sized to deal with the peak. And that in turn creates all manner of staffing and population-management issues.

    “You survey lung cancer patients about their satisfaction with care. Your results may be misleading because you can only ask the question of patients well-enough to take your survey. The really sick patients and those who have died aren’t available to you.” This struck a chord with me — about two weeks after my mother died, she got a letter from the hospital asking how well she was satisfied with the quality of her care. (I bet some careful rewording to explain intention-to-treat statistics could produce useful results here.)

  6. This is a case of the stock-flow distinction, isn’t it? Stock concepts (the prison-population) are heavily influenced by the long stayers. Flow concepts are heavily influenced by the frequent fliers. In the case of prisons, you’re right that length of sentence (rather than the chance of being imprisoned at all) will drive the stock of prisoners and getting rid of short sentences won’t actually reduce the overcrowding much. But (at least in the UK) a lot of problems beyond just the numbers of warm bodies the system has to accommodate are linked to short-sentence prisoners.

    Perhaps the flow of people in and out of prison is a better marker of the social experience of incarceration?
