Grading

Among the elements of standard pedagogy that could use the most work, in my view, is grading. Conventionally, this is done on specific exercises like an exam with a red pen, something I have never seen an adult do to another’s work, and by subtracting points from a preset total (usually 100). The latter violates the most fundamental principles of quality assurance, which prescribe using errors only to direct attention (in groups) to a productive process, never to assign blame to individuals. Deming is quite clear about this, discouraging not only punishment of individuals but individual rewards for excellence. He points out that if you single out the best salesman among twenty for a big prize, you will (i) usually reward random variation, and the winner’s performance next year will be disappointing as it falls back to the mean; (ii) create one winner and nineteen resentful losers; and (iii) provide no guidance for anyone about how to improve performance.

I have learned to grade everything by adding points for successes; I’m much more concerned that students will be afraid to use the stuff in the course in original ways for fear of making a mistake than that they will do something not quite right as alums and cause World War III. It seems to help the affective environment of the course a lot. I can always regularize the raw point scores on an exercise with z-scores.
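For readers who want to see the z-score step spelled out, here is a minimal sketch. It assumes the raw scores on an exercise are simply a list of points earned; the function name `z_scores` and the sample numbers are invented for illustration, not taken from any actual gradebook.

```python
from statistics import mean, pstdev

def z_scores(raw_points):
    """Rescale raw additive point totals to mean 0 and standard deviation 1."""
    mu = mean(raw_points)
    # Population standard deviation: the class is treated as the whole population
    # (an assumption; a sample standard deviation would also be defensible).
    sigma = pstdev(raw_points)
    if sigma == 0:
        # Everyone earned the same points, so nobody is above or below the mean.
        return [0.0 for _ in raw_points]
    return [(x - mu) / sigma for x in raw_points]

# Hypothetical raw points on one exercise.
scores = [12, 15, 9, 20, 14]
print([round(z, 2) for z in z_scores(scores)])
```

Because each exercise ends up on the same scale, exercises with very different raw point totals can be combined without the high-point ones dominating.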

We also tend to give grades that count some sort of terminal event, like an exam or a paper, very heavily, which provides feedback at a time when it can’t be used to improve performance, at least in the course itself. Grading a term paper draft, with extensive comments, is worth ten of grading the final product in terms of learning, as far as I can tell.

Actually, I’m more agnostic about how, and how much, educational practice should mirror work life than Mark thinks (see below), but I admit to being uncomfortable about how profoundly different they are. A typical classroom, for example, builds the skill of being in a room with a known authority who knows the truth, something successful organizations do not reward.

I think grading on some sort of a curve is unavoidable, partly because my exercises tend to be open-ended and I don’t really know what should get an A until I see how my students do with them, partly because classes vary from year to year in a way that seems to confound the central limit theorem, at least if each year’s enrollment is any kind of a sample from the same distribution. I’m happy to have the fat part of the distribution move up and down year to year so comparable performance more or less gets the same grade each year even if the class is full of stars.

Here’s a “curved” grading scheme that students seem to have stopped grousing about, invented many years ago over several years of improvisation and experiment with Bob Leone. I count collaboration and group work, including class discussion, very highly, as much as 40% of a course grade in some cases. I also need to undermine very ingrained instincts to flatter me and protect my ego, and anyway I can’t observe what I care about, which is students’ success in making each other smart, so I don’t feel I can properly grade class participation and don’t want the students to think there’s much payoff in showing off for me. From the start, I make a lot of fuss about getting students to pay attention to each other, putting a mug book on the web site in the second week, insisting they bring name cards to class every day, and learning their names in the first few weeks. Then, all the students grade each other on a scale of 1-5 on the criterion “X’s contribution to my learning in this course” (which obviously means different things to different students) three times during the semester.

I publish the results of the first two rounds, alphabetized within terciles or quartiles, so no one is at the top or bottom of the class, but these rankings don’t count for the final grade. The third time, I (and the TAs, if any) grade the student at the bottom of the distribution (or the second-lowest, in case of a hopeless outlier) on an absolute scale, and everyone else gets grades from there up to A. I make sure from the start that everyone notices the devious incentives: if people lower down the scale get their hands up and play, and if people at the top get together with lower-scorers and encourage them to overcome shyness and do their reading, everyone can get an A for this element. The undesirable incentives to scramble over the backs of your fellows to succeed are at least highly diluted.
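The final-round arithmetic can be sketched roughly as follows. This is only an illustration under stated assumptions: the post does not say how grades are spread between the bottom student and A, so the sketch assumes a straight-line interpolation on a 4.0 grade-point scale, and it assumes that anyone below the chosen anchor student (the "hopeless outlier" case) simply receives the anchor grade. The function names, the `anchor_rank` parameter, and the sample ratings are all hypothetical.

```python
from statistics import mean

def peer_averages(ratings):
    """ratings: {rater: {ratee: score from 1 to 5}} -> {ratee: mean score received}."""
    received = {}
    for rater, given in ratings.items():
        for ratee, score in given.items():
            if ratee == rater:
                continue  # self-ratings are ignored (an assumption)
            received.setdefault(ratee, []).append(score)
    return {name: mean(scores) for name, scores in received.items()}

def curved_grades(averages, anchor_grade, top_grade=4.0, anchor_rank=0):
    """Spread grades from the instructor-assigned anchor grade up to top_grade.

    anchor_grade: the absolute grade the instructor gives the anchor student.
    anchor_rank:  0 anchors on the lowest average, 1 on the second-lowest.
    Linear interpolation between anchor and top is an assumption of this sketch.
    """
    ordered = sorted(averages.values())
    low, high = ordered[anchor_rank], ordered[-1]
    span = high - low
    grades = {}
    for name, avg in averages.items():
        if span == 0:
            grades[name] = anchor_grade  # everyone tied: all get the anchor grade
        else:
            fraction = max(0.0, (avg - low) / span)  # clamp anyone below the anchor
            grades[name] = anchor_grade + fraction * (top_grade - anchor_grade)
    return grades

# Hypothetical third-round ratings for a three-person class.
ratings = {
    "ana": {"ben": 4, "cy": 2},
    "ben": {"ana": 5, "cy": 3},
    "cy": {"ana": 5, "ben": 4},
}
averages = peer_averages(ratings)
print(curved_grades(averages, anchor_grade=2.7))
```

Under these assumptions the incentive described above shows up directly: the higher the absolute grade the bottom student earns, the higher everyone else's grade, and if the bottom student earns an A, everyone does.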

I’ve tried the experiment of grading the students for class participation myself before I see their ‘votes’ three or four times, and between a quarter and a third of the class always wound up quite far from where I would have put them, usually higher. I infer that having them grade each other, aside from modeling what I want them to do and giving them incentives to do it, obtains information I couldn’t otherwise obtain.

Author: James Wimberley

James Wimberley (b. 1946, an Englishman raised in the Channel Islands; three adult children) is a former career international bureaucrat with the Council of Europe in Strasbourg. His main achievements there were the Lisbon Convention on recognition of qualifications and the Kosovo law on school education. He retired in 2006 to a little white house in Andalucia. His first wife Patricia Morris died in 2009 after a long illness. He remarried in 2011 to the former Brazilian TV actress Lu Mendonça. The cat overlords are now three. I suppose I've been invited to join real scholars on the list because my skills, acquired in a decade of technical assistance work in eastern Europe, include being able to ask faux-naïf questions like the exotic Persians and Chinese of eighteenth-century philosophical fiction. So I'm quite comfortable in the role of country-cousin blogger with a European perspective. The other specialised skill I learnt was making toasts with a moral in the course of drunken Caucasian banquets. I'm open to expenses-paid offers to retell Noah the great Armenian and Columbus, the orange, and university reform in Georgia. James Wimberley's occasional publications on the web

9 thoughts on “Grading”

  1. Having been in a couple of classes with Mike, I think I can comment on this method. I may be overly critical of some fellow students, but I think the scheme pushes too many students to the top of the grade distribution. This is because there is a total number that a grader must reach when summing the 1-5's of all graded students. In a class where there are a lot of high performers, this is good. In a class where there are, as I see it, a few students whom I've learned a lot from and a lot I haven't (because of attendance or failure to speak, no matter how much silence there is after a question is asked or a point is raised), it is difficult to allocate points. In the end, those that speak a lot (regardless of the quality of their contribution) seem to be rewarded, even by me, because I just don't know who else to give the extra points to. It is hard for me to know whether this is a common feature of class participation in general, as I do not know who gets what grades when professors determine the score. It seems, however, that students, particularly those who don't pay that much attention in class, just use the amount of time one speaks as a proxy for quality of learning. But, maybe, far be it from me to tell other students who contributes to their learning. In either case, it seems this problem could be corrected by an astute professor: a higher required number of points if it is felt that the class has been "good", a lower total number if it has been "bad".
    I would also just say that this scheme only works when the professor is committed to it, like Mike is. I have had other professors try a variant of it and without specific explanation, a "mug-book" and the 2 practice rounds, I think it is just too far from what students are used to to work effectively.

  2. Some of the "information [you] couldn't otherwise obtain" may consist of your students' popularity. I'm not sure how you could correct for students' tendency to rate more highly other students they like for reasons unconnected to the quality of contributions in class.
    Both the situation in which many students know each other from outside of class and the one in which almost no students know each other have pitfalls. In the former case, pre-existing friendships can skew the data, whether consciously or not. In the latter, students have an incentive to make themselves likeable in class, not just useful. Neither effect is necessarily bad; knowing how to make oneself well-liked is a good skill to have, friendship among students strongly promotes their ability to learn from each other, and the random average of many subjective evaluations may be fairer on the whole than one subjective evaluation. Still, there may be a distorting effect at work.

  3. I do not know precisely what you teach, but it seems to be a course in which creative thought and idea flow are important. I have taught such courses and like your approach, but I have more often taught informational, skills acquisition courses (specifically introductory chemistry). Call them sorting courses, but they are far less able to be structured as you propose.
    Your salesman analogy is apt for any business profession in which individual application of known skill sets is important. Implicitly, your salesmen all know the product and certain corporate strategies such that an excellent year is the result of individual prowess (which should be reproducible) or luck. The 19 disgruntled salesmen who did not win the award the one year due to luck have hope that luck will be with them the next year. It would seem that losing the sales award to a salesman with true prowess year after year would be more dispiriting.
    Compare the salesman, an employee with some independence, to the corporate accountant, whose success is improved by not making mistakes. An accountant who is uncertain of how to, shall we say, account has a short tenure. Accountants may interact with one another, but ultimately they must know what to do. Grading an accounting course should provide a concise measure of error, because error is what is to be avoided in that profession.
    The differing requirements of excellence for majors and non-majors are a serious difficulty in the fundamental courses. This is often, but not completely efficiently, recognized in many colleges by having separate course tracks for majors. Getting back to introductory chemistry, out of a class of 100, maybe 5-10 are truly interested and motivated. The others will be nurses, or engineers or farmers who have an underlying feeling that the subject is not relevant to their goals. I defer all arguments about the true meaning of education and the conflicts between shaping careers versus shaping minds.
    The luxury of a totally involved class, where individual success can be tied to group success, is rare. You are not often encumbered, it would seem, with students enrolled in the "let's get this over with" school of learning.
    It is a pleasure to read the ideas on this blog on the subject. Anyone who teaches should read the posts and think about their situation.

  4. One small point re: extensive comments on papers:
    I agree. I give extensive, typed comments on drafts, and many students have told me that it's the most helpful feedback they got during their entire college careers. One paper returned with extensive comments–explaining exactly how they're going wrong, exactly what they're doing right, and with a few of their own paragraphs re-written for them to show them how it's done–is worth 100 papers returned with 'vague' and 'thesis?' scrawled in the margins.
    Problem is it takes time. HUGE, MASSIVE amounts of time, even for 4-6 pp. papers. If you can do one 5 pp. draft this way in one hour, that's pretty good. If you're lucky enough to have only 20 students in the class, that's half a normal work week right there, on top of everything else that has to be done that week. And then there's the final version of the paper to be evaluated.
    It's still worth it, I think…but egad, the effort!
    (Add to this that, after all that work, I've had students look right past the 3-4 pp of typed comments to the grade, then look up and say "I need to talk to you about this grade"…)

  5. I need to assure everyone that I didn't arrange Avi's comment as evidence that [at least some of] my former students have learned to push back and treat me like a grownup :-).
    Popularity and usefulness aren't entirely uncorrelated. Still, I make a point to the students that part of being a responsible person is learning to distinguish between someone's performance at a task and how much you like him, and that telling people untruths that will make them happy in the short run isn't the way most people want their friends to treat them. In the end, these assessments are imperfect and I will do something better as soon as I know what it is.
    My introductory chemistry class in college was taught by William N. Lipscomb and I still remember it as being off the scale on the "creative thinking and flow of ideas" dimension. For lots of cool stuff about teaching science as though it's about ideas, with real research results to validate the practices, see http://www.colorado.edu/physics/phet/web-pages/in
    If Winston can provide "the most helpful feedback students got during their entire college careers" in an hour, or even two, it sounds like a pretty good use of time; less than two working weeks and you've done it for a class of seventy. Now we have to figure out how to stop doing all the things that don't lead to learning.

  6. I use a red pen on other adults' work all the time.
    I am an editor.
    My favorite feedback line from a mentor was (in red ink) "This is crap."
    It didn't point out why exactly it was, but assumed I knew why, and how to fix it myself. Depends on the field, but English professors got you ready for this critiquing.

  7. I've found that students like critiquing a class and they like providing feedback and suggestions for improving a course, but they dislike being asked to grade themselves or others. They respond to that as an abdication of the professor's responsibility and an unwanted burden.

  8. Nancy is generally right: students dislike this assignment and will sometimes come up with lots of reasons why they shouldn't have to do it. Like most people (me, certainly) they would rather criticize things for which they have no responsibility, like curriculum, the professor's classroom style, etc. etc. Actually, most adults dislike doing performance evaluation, especially of peers (I certainly dislike it), and most organizations have elaborate routines to mechanize, bureaucratize, and eviscerate it, generally in the interest of maximizing everyone's short-term comfort. Without rehearsing all the arguments, my view is that the students' view is totally understandable affectively, but they are mistaken on the merits. There are all sorts of things we should do in life that are disagreeable, but we learn to do them and get on with it because that's what it means to be a grownup. This is one.

  9. Well, I teach introductory statistics. I have thrashed around trying to find a method to force my undergraduate methods students to:
    1) Wake up in class;
    2) Read the textbook for which they paid an outrageous price;
    3) Ask questions about the material.
    Graduate students aren't nearly so much of a problem; they've mostly already figured out that they'll have to use 'this stuff.'
    I think Michael's suggestion has real merit, and I'm going to try it next fall. I usually make class participation worth 10% of the grade, but for this trial I think I'll bump it to 15%. In my scheme that's equivalent to the mid-term exam.
    BC
