Why is it so hard to increase learning?

The single most important obstacle to improving student learning in college (I don’t know enough about K-12) is our terrier-like obsession with assessment and our faith in punishment and reward as the only motivators of any use. It’s all summative evaluation, which does nothing for performance because it’s delivered too late and in an affectively toxic context (like the critique you give a student paper after it’s turned in and the semester is over), and it’s framed as mentor-to-novice and downward rather than peer-to-peer and 360°. What improves quality is a process that is (1) formative, with evaluation the least of it, and (2) collaborative. These are bromides of industrial quality assurance, but in higher ed, for teaching, we are still in the dark ages of the 1970s when GM thought it could make quality cars by doubling inspections and having a larger reject lot at the end of the assembly line where defective cars could be triaged into “scrap, rework, or ship”.

It’s also way over-focused on classroom performance, and what the prof does rather than what her students are doing. Learning happens while reading, writing, doing problem sets, group projects…not even mainly when being lectured at. Still, the most important single thing any dean or chair who cares about student learning can do is simply to set up a schedule with dates and names, whereby every prof visits five class sessions of colleagues every semester.

Here are some propositions implicit in the discussion of this issue, and in our practice, that need a lot more skeptical examination:

(1) Improving performance in an affectively fraught, improvisational, creative enterprise can only be accomplished with objective measurements of performance (thus, student test score mania). This is why we can tell from real data–total sales in money, square feet, or downloads–that Thomas Keane is a better painter than Vermeer, and that Britney Spears is a better singer than Renee Fleming. If you know Mariah Carey has a wider range than Ella Fitzgerald, there’s nothing to be learned by actually listening to them! We all know that artists actually learn nothing about how to compose or paint in studio courses without written exams, nor just looking at/listening to each others’ work and talking about it. If they would only learn to use colorimeters and frequency analyzers, we could get some good art!

(2) Knowledge is facts, teaching is telling, and learning is recall (David Cohen), hence fact-recall tests are the unique measure of learning.

(3) The key, maybe the only, element of value creation is avoiding mistakes. Wagner’s greatness is owing to his low error rate; he never violated the voice-leading and harmony prescriptions of a standard textbook. This is why the best teachers always grade by taking off points from 100 for mistakes, instead of adding on points for successes.

(4) Teaching effectiveness is a trait, so a dollar spent assessing it (so you can promote good teachers and fire bad ones) is worth a hundred trying vainly to improve it. Also, firing and raises are quick, cheap, and allow us to get back to writing that journal article, something we know we’re good at. Quality assurance for teaching is complicated, time-consuming, subjective, and messy, so offload that stuff to unpaid staff (students via post-course evaluations) and maybe, checklist rubric scoring. Squeaky chalk? -10.

(5) Even if you don’t believe (4), and want to waste everyone’s time making teachers better at what they do, money and fear are the unique motivators for teachers. Evidence: when have you seen a grade school teacher spend his or her own money for classroom supplies?

(6) Coaching is OK for people of modest intellectual chops performing mindless physical tasks. Like football players, opera singers, and, um, heart surgeons (cf. A. Gawande). Collaboration and peer advice are essential for research but, for reasons much too arcane and technical to actually explain, useless for teaching. In fact, while peer review of research is the gold standard of academic progress, for professors to visit each others’ classrooms or kibitz on curriculum, assignments and homework is not only useless but a moral outrage, a violation of academic freedom. I know, it’s paradoxical: if you’re not a professor, you can’t understand. Deal with it.

(7) Coaching is always downward hierarchically, thus occasional visits by senior faculty to assistant profs’ classes at promotion time. We know from sports that the first requirement of a track coach is that he can outrun all his sprinters. Nadia Boulanger taught many of the most important 20th-century composers because she was the greatest composer of them all; in fact, it’s a constant struggle to get her off symphony programs to make room for a little Aaron Copland.

Author: Michael O'Hare

Professor of Public Policy at the Goldman School of Public Policy, University of California, Berkeley, Michael O'Hare was raised in New York City and trained at Harvard as an architect and structural engineer. Diverted from an honest career designing buildings by the offer of a job in which he could think about anything he wanted to and spend his time with very smart and curious young people, he fell among economists and such like, and continues to benefit from their generosity with on-the-job social science training. He has followed the process and principles of design into "nonphysical environments" such as production processes in organizations, regulation, and information management and published a variety of research in environmental policy, government policy towards the arts, and management, with special interests in energy, facility siting, information and perceptions in public choice and work environments, and policy design. His current research is focused on transportation biofuels and their effects on global land use, food security, and international trade; regulatory policy in the face of scientific uncertainty; and, after a three-decade hiatus, on NIMBY conflicts afflicting high speed rail right-of-way and nuclear waste disposal sites. He is also a regular writer on pedagogy, especially teaching in professional education, and co-edited the "Curriculum and Case Notes" section of the Journal of Policy Analysis and Management. Between faculty appointments at the MIT Department of Urban Studies and Planning and the John F. Kennedy School of Government at Harvard, he was director of policy analysis at the Massachusetts Executive Office of Environmental Affairs. He has had visiting appointments at Università Bocconi in Milan and the National University of Singapore and teaches regularly in the Goldman School's executive (mid-career) programs. 
At GSPP, O'Hare has taught a studio course in Program and Policy Design, Arts and Cultural Policy, Public Management, the pedagogy course for graduate student instructors, Quantitative Methods, Environmental Policy, and the introduction to public policy for its undergraduate minor, which he supervises. Generally, he considers himself the school's resident expert in any subject in which there is no such thing as real expertise (a recent project concerned the governance and design of California county fairs), but is secure in the distinction of being the only faculty member with a metal lathe in his basement and a 4×5 Ebony view camera. At the moment, he would rather be making something with his hands than writing this blurb.

14 thoughts on “Why is it so hard to increase learning?”

  1. I wonder sometimes how much of our frustration with secondary education, for example, is a side-effect of the negative bias in grading/assessment. A student starts with a hypothetical 100% and from there can only fail — losing points progressively with every mistake. For example, a vocabulary test measures whether a child knows the words on a given list. They might know dozens of harder words that aren't on the set list, yet forget the definition of one that is on the list. FAILURE!! The positive tail is truncated and knowing more than the minimum required is effectively penalized.

  2. Looking at this from an engineering perspective, objective measurements of performance are absolutely vital. You can't control what you don't measure. That's basic.

    But you have to get the measurements back in time to do something in response to them. Ideally in real time, certainly before the process is complete. A measurement that you don't get back until after you're done is almost perfectly useless. Grading students after they've completed the course doesn't provide the students with useful feedback. It *might* be useful feedback about the performance of their instructors, but the instructors are resistant to it being used that way.

    My wife is currently taking a chemistry class at the local technical college. Most of the homework is online, evaluation of each answer is instant, and you get three tries on most problems. (Only the questions with binary answers are one try, for obvious reasons.) THAT is proper feedback!

    That's also practically impossible to implement unless you either have an extremely low student/teacher ratio, or automated grading. And extremely low student/teacher ratios are financially infeasible, if most courses are going to be taught hands on by teachers.

    I think you need to divide the subject matter in two: the topics that can be taught in an automated fashion, which get largely automated with very limited use of human teachers, and the subjects where automated teaching/grading is simply impossible as yet, which get the human teachers. That way the teachers can be deployed more efficiently, which would permit lower student/teacher ratios on selected subjects.

    At least, that's how it looks from an industrial engineering viewpoint. Automate wherever possible, so that your limited human resources can be deployed where they're actually needed.

    1. Objective measurements of performance are indeed absolutely vital. And you're right–getting the measurements fast enough that you can put them to good use is essential.

      But the bit you're skipping–and it seems to me, the point of the article–is that it's very easy to say "We must measure performance. We can measure X. Therefore, X is performance."

      Like you, I expect, I do engineering for a living. And much of what I do is about realtime monitoring of the performance of complicated systems, in a reporting structure that takes pride in being data-driven. And one of the things we've learned through hard experience is that it's very easy to focus on the wrong data, so long as those data are easy to gather and, at a glance, appear to be at least passably relevant.

      High-stakes testing of students to measure teacher performance looks to me an awful lot like measuring the temperature of car exhaust to understand engine efficiency–it's easy data to gather, and it seems like there ought to be some kind of relationship. But if you rely on them to the exclusion of all else, it turns out that those data might be worse than useless.

      1. I understand that judging teachers by students' grades is problematic. I think there's a place for it, but it can't be done in a simplistic manner. You'd have to account for the nature of the incoming students, for one thing.

        But, at some point you do have to judge the teachers, and some measure of how much their students learn would appear to be relevant. The students' learning is the desired end product, after all.

        1. No, you should never "judge the teachers". It's totally immoral to judge another human being. You mean, "judge the teacher's performance at a particular task", and one of the reasons performance evaluation is the most incompetent and corrupt management function in most organizations is that it feels like the former formulation and is affectively poisoned from the start…especially if we use that careless language.
          There's more: what constitutes "an evaluation"? I get a student paper and I give it (i) a grade of B+ and (ii) some comments about how it could be [even] better. Which of those signals is useful to the student? If the paper gets a grade of A, wouldn't (ii) still be valuable? Except in the very few cases of failing (firing staff), isn't the absolute-scale evaluation pretty much useless; does (i) actually add anything to (ii)?

      2. Well put. Your point about deciding X is performance because we can measure it is spot on. Of course, teachers' performance isn't necessarily reflected in the grades, per se, but in the grading. The question is not "are the students getting good grades?" but "is the students' work being graded accurately and in a way that provides them with what they need in order to learn?"

        I speak as someone who spent many years training teachers to teach writing, primarily at the college level, but we did some work with secondary teachers. The end-product grades the students get are to a large degree a matter of what they bring to the class, whether it be nutrition, time pressures, how much sleep they're getting, whether they're working on a topic they really *want* to engage with, etc.

        It takes teachers a long time, relatively speaking, to do a good job grading written work and it takes supervisors a long time, again, relatively speaking, to assess the teachers' grading. You don't just crunch the numbers and decide you're done, although we did plenty of numbers-crunching. It's time- and labor-intensive–one of our main techniques was to examine students' portfolios of graded work over the semester, looking both at the work and the grading practices. I don't see a way of getting past that without ignoring everything we know about best practices and coming up with a lot of results that look impressive but are at best useless and at worst seriously misleading.

    2. I'm a teaching faculty member (non-PhD, no research requirement) in computer science at a public urban university. At least in CS, the big push is toward the so-called "flipped classroom" model. Yes, you identify what they can get from reading the book, online lectures, etc. Provide the lectures, and online quizzes. Review & retake them as many times as you want, for a small amount of points. (If we make them worth 0, no one does them–undergraduates seldom do optional.) Our textbook publisher, like most others in this field, has a website with practice problems. Again, for a relatively small part of the grade, do 85% of them by the end of the week, doesn't matter how many attempts you make–the point is practice. This is the CS equivalent of a piano student playing scales and basic drill.

      Class is spent working practice problems, or looking at sample problems with things I know typically cause problems for students. There are clickers & various online ways for students to 'vote' for what they think the correct answer is–a typical pattern is: Here's what we want to do, which of these 5 code snippets do that? Click in, look at how the class 'voted,' then discuss with your neighbor & click in again, then we all discuss. Usually they home in on the right answer with a chance to think about it–if not, great, we've found something that they're having trouble learning, and I know I'm not wasting my time lecturing on things they could easily read in the book. The cutesy ed-theory slogan that gets bandied about is "Not the sage on the stage, but the guide to the side."

      Schools that are doing this are seeing better performance and better semester-to-semester retention. More students are finishing on time. I've seen a notable increase in the number of A's & B's, and a corresponding drop in C-'s and D's. (F's are mostly people who never come to class; that's consistent regardless of what happens in the classroom.)

      For each course, we've identified some specific things we want students to learn, how we're going to assess it (project, final exam question, whatever) and track performance. This is mostly in response to our accreditation requirements.

      And I've been informed by colleagues in the College of Arts & Sciences that while that may work for Engineering courses (where our CS program is housed), it couldn't possibly work for A&S courses. I have yet to hear a reason as to why this is so, but I have been assured by many A&S faculty that it is.

      1. Hooray for the flipped classroom. The longest lecture I've given in the last five years is maybe ten minutes, when the students get stuck on something I can fix quickly. I miss it terribly; it was a real ego trip to be the only one talking, saying the smartest things I knew, while a roomful of students were furiously writing everything down as though I were Moses coming down from the mountain with tablets.
        Are your language teachers still telling their students French; is it working?
        I might note that the flipped classroom is the universal pedagogy in every context in which the task is for the students to acquire skills (other than stenography): painting/drawing, playing a musical instrument, sailing a boat, acting, welding…and has been for millennia.

      2. What is now called the flipped classroom is the traditional way at least some Humanities classes are taught: students do the readings outside of class and come in prepared to discuss, debate and otherwise engage with the material.

        But I'm having a little trouble figuring out how what you describe would work for, say, teaching Hamlet.

  3. I am a mid-career professional in the education field and returned to graduate school to get my PhD in history. I was shocked at how much of what you discuss in this essay can be applied to graduate education. I'll defend soon and I'll have a credential, but I really didn't learn anything I didn't teach myself on my own.

  4. 4(a) If firing individual teachers or giving them a raise is cheap, then turning over the entire adult complement of a school is even cheaper, and definitely better for students, who do not benefit at all from establishing longterm relationships with the adults at their schools.

    also 1(a) Performance improvements are always monotonic, and any decline in performance or in the slope of improvements must be subject to immediate sanctions to get the system back on track.

  5. I would first ask you to define the problem. Just out of curiosity… what makes you think kids in college aren't learning "enough?" Is this based on your own students?

    For myself, I have to question the utility of group projects. They are (maybe) good at teaching cooperation… and if you are lucky to get a good group, the process can be quite enjoyable. But I do not know that I "learned" more, and my grades were always based on my individual assignments anyway. Maybe they shouldn't be graded? It seemed to me that professors assumed that the people getting high individual scores must of necessity be doing the heavy lifting in groups. Which I question. And if the students rate each other, that's messy too. Maybe we shouldn't worry about grades so much? But then, again, how do you measure "learning" and whether "enough" of it is happening?

    Overall though of course you make a lot of good points… which it would be nice if, for example, law professors would read. (Almost a complete waste of time, law school. Pedagogically and career-wise except for the funneling… oh, but wait…) So again, what is the problem we are trying to fix? And (to me) most importantly, *what* is our motivation? That is, I know *yours* because you seem convincingly concerned with things with inherent value… but other people working on this "problem?" I'll have to wait and see. I think the subjective motivations people bring to policy making are determinative, which is probably an egregiously obvious thing to say, except… that no one ever talks about it. And I think *this exact thing* is why K-12 ed "reformers" are so very very toxic. They come from a bad place. Sorry to say it but the proof is in their works and their actions.

  6. My definition of "not learning enough" is that they could be learning more if we could get better at what we do with reasonable effort. See my comment to Brett above regarding absolute-scale measures.
