Among the elements of standard pedagogy that could use the most work, in my view, is grading. Conventionally, this is done on specific exercises like an exam, with a red pen (something I have never seen one adult do to another’s work), and by subtracting points from a preset total (usually 100). The latter violates the most fundamental principles of quality assurance, which prescribe using errors only to direct attention (in groups) to the productive process, never to assign blame to individuals. Deming is quite clear about this, discouraging not only punishment of individuals but also individual rewards for excellence. He points out that if you single out the best salesman among twenty for a big prize, you will (i) usually reward random variation, so that the winner’s performance next year will be disappointing as it falls back to the mean; (ii) create one winner and nineteen resentful losers; and (iii) provide no guidance to anyone about how to improve performance.
I have learned to grade everything by adding points for successes; I’m much more concerned that students will be afraid to use the stuff in the course in original ways for fear of making a mistake than that they will do something not quite right as alums and cause World War III. Grading this way seems to help the affective environment of the course a lot. I can always regularize the raw point scores on an exercise with z-scores.
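A minimal sketch of that regularization, in Python, assuming the raw scores for one exercise are just a list of numbers. The function name z_scores and the choice of the population standard deviation are illustrative assumptions, not a description of any particular gradebook.

```python
from statistics import mean, pstdev

def z_scores(raw_points):
    """Convert additive raw point totals on one exercise to z-scores.

    Each score becomes its distance from the class mean, measured in
    standard deviations, so exercises with different point totals can
    be compared or combined.
    """
    mu = mean(raw_points)
    sigma = pstdev(raw_points)  # population SD: the class is the whole population here (an assumption)
    if sigma == 0:
        # Everyone earned the same points; no spread to standardize.
        return [0.0 for _ in raw_points]
    return [(x - mu) / sigma for x in raw_points]
```

Because the raw totals are built up from successes rather than deductions, they have no natural ceiling, and standardizing is one way to put different exercises on a common footing afterward.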
We also tend to give grades that count some sort of terminal event, like an exam or a paper, very heavily, which provides feedback at a time when it can’t be used to improve performance, at least in the course itself. Grading a term paper draft, with extensive comments, is worth ten of grading the final product in terms of learning, as far as I can tell.
Actually, I’m more agnostic about how, and how much, educational practice should mirror work life than Mark thinks (see below), but I admit to being uncomfortable about how profoundly different they are. A typical classroom, for example, builds the skill of being in a room with a known authority who knows the truth, something successful organizations do not reward.
I think grading on some sort of a curve is unavoidable, partly because my exercises tend to be open-ended and I don’t really know what should get an A until I see how my students do with them, and partly because classes vary from year to year in a way that seems to confound the central limit theorem, at least if each year’s enrollment is any kind of a sample from the same distribution. I’m happy to have the fat part of the distribution move up and down from year to year, so that comparable performance gets more or less the same grade each year, even if the class is full of stars.
Here’s a “curved” grading scheme that students seem to have stopped grousing about, worked out with Bob Leone many years ago over several years of improvisation and experiment. I count collaboration and group work, including class discussion, very highly, as much as 40% of a course grade in some cases. I also need to undermine very ingrained instincts to flatter me and protect my ego, and anyway I can’t observe what I care about, which is students’ success in making each other smart, so I don’t feel I can properly grade class participation and don’t want the students to think there’s much payoff in showing off for me. From the start, I make a lot of fuss about getting students to pay attention to each other, putting a mug book on the web site in the second week, insisting they bring name cards to class every day, and learning their names in the first few weeks. Then, three times during the semester, all the students grade each other on a scale of 1-5 on the criterion “X’s contribution to my learning in this course” (which obviously means different things to different students).
I publish the results of the first two rounds, alphabetized within terciles or quartiles, so no one is at the top or bottom of the class, but these rankings don’t count toward the final grade. The third time, I (and the TAs, if any) grade the student at the bottom of the distribution (or the second-lowest, in case of a hopeless outlier) on an absolute scale, and everyone else gets grades from there up to A. I make sure from the start that everyone notices the devious incentives: if people lower down the scale get their hands up and play, and if people at the top get together with lower-scorers and encourage them to overcome shyness and do their reading, everyone can get an A for this element. The undesirable incentives to scramble over the backs of your fellows to succeed are at least highly diluted.
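The mechanics of the scheme can be sketched in a few lines of Python. Several things here are assumptions for illustration only: the function names, the 4-point scale with A = 4.0, averaging the 1-5 ratings each student receives, and a straight-line mapping between the bottom student’s absolute grade and an A (the scheme only says grades run “from there up to A”). The hopeless-outlier case, where the second-lowest student is the anchor, is not handled.

```python
from statistics import mean

A_GRADE = 4.0  # assumed 4-point scale; hypothetical, not specified in the text

def mean_received(ratings):
    """ratings: {rater: {ratee: score 1-5}} -> {ratee: mean score received}."""
    received = {}
    for rater, given in ratings.items():
        for ratee, score in given.items():
            if ratee == rater:
                continue  # ignore any self-rating
            received.setdefault(ratee, []).append(score)
    return {name: mean(scores) for name, scores in received.items()}

def alphabetized_groups(averages, n_groups=3):
    """Rounds one and two: rank by average, cut into up to n_groups bands
    (best band first), and alphabetize within each band so no one is
    visibly first or last."""
    ranked = sorted(averages, key=averages.get, reverse=True)
    size = -(-len(ranked) // n_groups)  # ceiling division
    return [sorted(ranked[i:i + size]) for i in range(0, len(ranked), size)]

def final_grades(averages, anchor_grade, top_grade=A_GRADE):
    """Round three: the lowest-rated student gets anchor_grade (assigned by
    the instructor on an absolute scale); everyone else is placed between
    that and top_grade in proportion to their average (assumed linear)."""
    low = min(averages.values())
    high = max(averages.values())
    if high == low:
        # No spread: everyone shares the bottom student's absolute grade.
        return {name: anchor_grade for name in averages}
    span = top_grade - anchor_grade
    return {name: anchor_grade + (avg - low) / (high - low) * span
            for name, avg in averages.items()}
```

Under these assumptions the incentive is visible in the arithmetic: anything that raises the anchor grade, that is, the absolute performance of the lowest-rated student, raises every grade above it, and if the bottom student earns an A then so does everyone.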
Three or four times, I’ve tried the experiment of grading the students for class participation myself before seeing their ‘votes’, and between a quarter and a third of the class always wound up quite far from where I would have put them, usually higher. I infer that having them grade each other, aside from modeling what I want them to do and giving them incentives to do it, yields information I couldn’t otherwise obtain.