Thursday, March 10, 2011

The Scoreboard

I recently came across this table. It is a table that a secondary school generates and sends to every college and university to which its students apply. These are the cumulative GPAs for the Class of 2012 at this school. This is an actual thing, that is, I did not make this up.

Why would a secondary school do this? Does this advantage students? What might a college admissions officer make of this?

What about the students? This is accessible on the school's web site (that's where I got it). If I'm a 12th grader at this school, then I'm in that list somewhere. What does that make me think? Is school a competition? Is this the scoreboard? Am I a winner? A loser?

I have no idea if other schools do this, but my guess would be that they do.

What do you think?

Thursday, February 17, 2011

I am the Gatekeeper.... are you the Keymaster?

The role of the letter grade as gatekeeper is yet another reason I am looking for something better.

At the end of my honors 9th grade math class today, one of my students asked me about moving up next year to the accelerated 10th grade math course (yes, we have a level above "honors"). This student, let's call him Ricardo, is a new 9th grader, that is, he was not in my school's 8th grade last year (my school is an independent K-12 school). He explained that when the year began, he felt mostly lost because the math curriculum at his previous school was far behind ours. So, though he began the year in the C+/B- realm, his grades have steadily improved, and so have his confidence and understanding of the material.

I agree with Ricardo's assessment 100%. However, his semester grade was a B and he earned a B on the semester exam. Our criterion for a student moving from the honors course to the accelerated course is a 95% average, and a student ought to have earned that without extraordinary effort. Essentially, for the student who wants to move up, the honors course should be easy.

At this point, it does seem like Ricardo is understanding the newer topics with more ease than was the case in the first quarter. In fact, his level of performance at the moment might be at the necessary level for the move he wants to make. But, is that enough? From a pedagogical point of view, if he shows a mastery of the course throughout the second semester, should that make up for his difficulties in the first semester? As the topics in the course build off of each other, should he show such a mastery in the second semester, it would show that he has now mastered the first semester topics, too.

However, from the grade's perspective, it may be too little too late (to use a boring cliche). His semester "average" was an 85. Thus, even if his average in the second semester is 100, he would average 92.5 for the year. He would fall short of the 95 benchmark. And, by his own admission, the first semester was not easy for him.
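The arithmetic here is worth making explicit. A minimal sketch (the function name and the use of a simple mean of semester averages are mine; the 85, the 100, and the 95 benchmark come straight from the scenario above):

```python
# The school's 95% benchmark for moving up to the accelerated course.
BENCHMARK = 95

def year_average(sem1: float, sem2: float) -> float:
    """Simple mean of the two semester averages."""
    return (sem1 + sem2) / 2

# Even a perfect second semester cannot overcome the first-semester 85:
best_case = year_average(85, 100)
print(best_case)                # 92.5
print(best_case >= BENCHMARK)   # False
```

The point of the sketch: under a straight year-long average, Ricardo's fate was mathematically sealed at the end of the first semester, no matter what he learns afterward.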

How much importance should really be put on that grade?

I do think our school is flexible enough to look beyond the letter grade and take the student's full performance into consideration. Also, the student could potentially make the move from honors to accelerated between the 10th and 11th grade year.

That's only one small example of how letter grades serve as keys to opening doors in education. The "best" example of this role of the letter grade is college admissions. Ricardo probably sees getting an A in honors math as not enough. Instead, he wants to get an A in accelerated math so that he can say he took the most rigorous courses at our school to impress colleges.

Do you see what I see? If this is Ricardo's thinking (or his parents' thinking), Ricardo is treating his math courses as steps on the way to college. The content of the courses is of no relevance. All that matters is that he does well in the hardest classes. Does he enjoy math? Does it spark his imagination? Does he explore topics in math in greater depth in ways that are not graded? Maybe, but the system does nothing... NOTHING... to promote this love of learning.

Monday, February 7, 2011

Keeping up with the Joneses

In math, we say that 45 > 42 because 45 represents a larger quantity than 42. So, it seems fair to say that if Peter gets 42 points on a test and Claire gets 45 points on the same test then Claire did better than Peter. Right?

Hmmm... Suppose there were 50 points possible on my test. Suppose there are 10 questions and each is worth 5 points. Then perhaps Claire got 9 problems completely correct and 1 problem completely incorrect, while Peter lost 1 point on each of 8 problems (and was perfect on the other 2). So, who did better? Is one large error better than several smaller errors? What seemed so clear now becomes murky.

In the "safe" world of adding up points earned and dividing by the points possible, Claire gets an A- (90%) and Peter gets a B (84%). But, if the above were true, I find it genuinely hard to decide who did "better".
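To make the scenario concrete, here is a sketch of the two hypothetical score patterns (the per-problem breakdowns are the ones described above; the variable names are mine):

```python
# Ten problems, five points each, fifty points possible.
claire = [5, 5, 5, 5, 5, 5, 5, 5, 5, 0]   # 9 perfect, 1 completely wrong
peter  = [5, 5, 4, 4, 4, 4, 4, 4, 4, 4]   # 1 point off on 8 problems

def percent(scores, possible=50):
    """The 'safe' conversion: points earned over points possible."""
    return 100 * sum(scores) / possible

print(percent(claire))  # 90.0 -> the A-
print(percent(peter))   # 84.0 -> the B
```

The percentages come out exactly as above, yet the two lists describe very different kinds of understanding, which is precisely what the single number hides.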

Keeping with the above scenario, Peter would have two "perfect" problems and 8 "imperfect" problems. What if Peter got full points on the problem on which Claire earned zero points? So, he understands that idea way better than Claire. That would also mean that Claire understands 8 questions better than Peter, but only in a small way on each.

Suppose, instead, the test were out of 214 points? How much of a difference is 3 points?

Comparing students to students is the real issue here. We, as teachers, do this all the time. The SAT does it. Colleges do it. Students do it. Why? To see who is better, or worse, or the same as us? How does that help my learning?

Let us keep Peter with his 84% and Claire with her 90%. What do these scores tell us about how they are doing overall? Not much. Those scores provide us only some measurement of how they did on one individual test. Perhaps Peter has never scored as high before. Perhaps Claire has never scored as low before. Now what? Does that diminish the value of Claire's score? Does that augment Peter's? Might we now say that Peter has, in a way, done "better" than Claire?

No, I go for Option Q2. With such scores on a test, do not be concerned with such questions as "Who did better on this test: Peter or Claire?" Rather, ask the question "What does Peter's performance on this test tell us about how Peter is doing at learning the ideas being tested?" Ask the same question of Claire. But don't bother with trying to make sense of comparing the two.

Tuesday, February 1, 2011

If I must give a grade...

As promised, here is a recipe I have followed in previous years for figuring grades on math tests.

I begin by correcting the tests against my answer key (also known as a mark scheme). Each problem has a defined number of points (or marks), and each point is earned for either an accurate value or a correct method. Many problems have multiple steps, and some necessitate multiple methods, thus a single problem can have multiple method points and multiple accuracy points. I use the idea of follow-through. This means, if a student errs in part (a), but then properly uses the incorrect answer of (a) to solve part (b), then they are not penalized in part (b) even though their solution may be incorrect.

After I correct all of the tests, I order them with the greatest number of points on top. I then go through the tests and try to determine the quality of the work produced. I begin by setting the grade boundaries. So, let's say a test has 35 possible points and the top score was a 33. I might start looking at tests with 29 points and see if I think they meet the "A" criteria. (My school has a one-sentence description of "A" work. I have expanded greatly on this to explain what I expect students to show in my math class.) Having years of experience, I have my criteria pretty well set in my head, so it isn't too hard for me to think to myself, "This 29 feels like an A-." If there are multiple 29s, then I read over one or two others to see if they also feel like an A-. Then, I look at the 28s. Do they feel like an A-, too, or are they more of a B+? Then, I do that with the 27s. At some point, the quality of the work clearly no longer meets the "A" criteria. So, I might decide that 28 is the bottom of the A's. Then, I do the same thing to determine the B/C boundary.

Once I have found the A/B boundary and the B/C boundary, I calculate a best-fit line to convert the number of points on the test into the 100-90-80-70-60 scale. Why? Because I have to give grades in that system. That's why. So, I then look at what that does. I look to see that it accurately represents the quality of the students' work. If necessary, I look to see what that does to the C/D boundary. Does that boundary seem reasonable? If anything seems out of whack, I tweak my boundaries. By "out of whack", I mean, are grades being assigned fairly? For example, a 15 might work out as a C-, so I ensure that the quality of work for a 15 is truly a C-. If I think, no, it's a D+, or, no, it's a C, then I adjust my best-fit line.
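One way to sketch this boundary-to-line conversion in code (the raw B/C boundary of 22 is hypothetical; the post's example only fixes the A/B boundary at 28 for a 35-point test, and the 90/80 targets come from the 100-90-80-70-60 scale):

```python
def make_converter(ab_raw, bc_raw, ab_pct=90, bc_pct=80):
    """Return a linear map sending the A/B boundary to 90 and the
    B/C boundary to 80 on the 100-90-80-70-60 scale."""
    slope = (ab_pct - bc_pct) / (ab_raw - bc_raw)
    def convert(raw):
        return bc_pct + slope * (raw - bc_raw)
    return convert

convert = make_converter(ab_raw=28, bc_raw=22)
print(convert(28))  # 90.0 -- bottom of the A range
print(convert(22))  # 80.0 -- bottom of the B range
print(convert(33))  # the top paper lands comfortably above 90
```

The "tweaking" described above amounts to checking what this line does at other raw scores (the C/D boundary, a 15, and so on) and moving the boundary inputs until the outputs feel fair.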

This may seem like a laborious process, and it certainly is in comparison to the standard math teacher system (add up points earned and divide by points possible to get a percentage that converts to a letter), but I used it for several years and became quite adept at it.

I like this system better because grade boundaries are not pre-determined. With the other system, I found that sometimes a test was easy, and mediocre work was awarded a B while in other cases excellent work was awarded a C+. With my system, grades are awarded based on the quality of the work actually produced. With the other system, I have to be a fortune teller and predict what excellent work will look like.

If this system sounds at all familiar to you, then you probably have taught at an IB school. The International Baccalaureate (IB) uses a system much like this to assign grades to its exams. I know, because I used to work at an IB school and the math department graded all of its tests in this manner (minus the converting of the score into the 100-90-80-70-60 scale).

It's not a perfect method. Not all 27s on a test are the same, for example. And, it is quite different from other math teachers at the school where I teach. So, many students are confused by it and some don't bother trying to understand it. But, I truly believe it is in the interest of the student to do it this way.

This year, I am teaching a 9th grade course and, obeying a request from higher-ups, I have simplified my process. I often found that when I did the above conversion, a zero score on my tests would convert to a 50. This makes sense to me because in the 100-90-80-70-60 scale, an F technically has a range of 60 points. Why? Who knows. To me, a range of 10 for each grade makes sense. So, making a zero convert to a 50 makes sense.

So, this year, all of my students begin with a 50 and then the points they get on a test are used to figure what fraction they get of the remaining 50. For example, let's say a test has 35 points and a student gets a 25. 25 out of 35 is 71.4%. So that is the percent of 50 they get in addition to the 50 they started with. 71.4% of 50 is about 36. So, a 25 would convert to an 86 (36+50=86).
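The simplified conversion is easy to write down as a single function (the function name is mine; the 35-point test and the score of 25 are the example above):

```python
def convert_score(points, possible):
    """Start everyone at 50 and award the remaining 50 in proportion
    to the fraction of test points earned."""
    return 50 + 50 * points / possible

print(round(convert_score(25, 35)))  # 86, matching the worked example
print(convert_score(0, 35))          # 50.0: a zero converts to a 50
print(convert_score(35, 35))         # 100.0: a perfect test is still a 100
```

Put another way, it is the ordinary percentage rescaled from the 0-100 range into the 50-100 range, so each letter grade occupies a 10-point band.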

Okay, I know you're likely confused, so comment away and I will try to explain it better...

Friday, January 21, 2011

Using rubrics - The letter grade on steroids

In my last post, I began discussing alternatives to the standard letter grade and mostly wrote about using narratives for evaluating work. Tonight, I'd like to discuss the use of criteria and grading rubrics.

Criterion-based grading, for me, is a large improvement on the standard math teacher grading system (i.e. add up points earned and divide by points possible to get a percentage that converts to a letter). I use criterion-based grading exclusively in my Senior Stats course. Here is an example of a grading rubric I used this year.

Here is how I use my rubric. I read through the project as a whole, making notes and comments as I go. I then evaluate each criterion, one by one. Each criterion has various achievement levels. Some go from 0 to 2, others from 0 to 4. In order to reach level 2, a project must satisfy the full descriptions of both level 1 and level 2. To reach level 3, a project must satisfy the requirements of all 3 descriptions.

I attempt to write the descriptions as plainly as possible and in a logical progression so that attaining a higher level should show an improved understanding. For example, take my "Displays" criterion. To reach level 1, a student must only create one correct display that is relevant to the project topic. Level 2 is attained if a student makes multiple types of displays. So making 2 different histograms would be insufficient. Making a histogram and a boxplot would satisfy the description. In each case, the displays must be made "correctly," meaning, in part, that they should be accurate and be labeled sufficiently. To reach level 3, the displays must "communicate well" and have a "high level of accuracy." Essentially, this is a way for me to distinguish between the student who makes a few sloppy displays and the student who makes a few excellent displays. Finally, level 4 requires a certain degree of sophistication within the displays. This means that they are all effective and lead to insightful analysis. There is a level of complexity within the displays.
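The cumulative-level idea can be sketched in a few lines of code. This is a hypothetical illustration, not my actual workflow: the function name is mine, and the example checks are my paraphrase of the "Displays" criterion.

```python
def achieved_level(checks):
    """checks[i] is True if the project satisfies the level-(i+1)
    description. Return the highest level reached with no gaps:
    a project cannot earn level 3 without also satisfying 1 and 2."""
    level = 0
    for satisfied in checks:
        if not satisfied:
            break
        level += 1
    return level

# A project with one correct display and multiple display types,
# but sloppy execution: it stops at level 2.
print(achieved_level([True, True, False, False]))  # 2
```

The key design point is that the levels are conjunctive: each higher level adds requirements on top of the lower ones rather than replacing them.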

So, I evaluate the project against all of the criteria. I write narrative comments for each criterion to explain why a certain level was awarded. I note errors as well as strengths. This example will give you a taste of what I try to do each time (you can zoom in on the image to make it large enough to read).

In the end, however, by using achievement levels, what I've really done is assign a variety of grades to a project. In a sense, I've given a student a "letter" grade for their introduction, another for their data, one for their calculations, and so on. So, I feel like I'm not exactly moving away from the letter grade entirely. And, given where I teach, in the end, I do have to assign an actual letter grade to the project. So, I do my voodoo and do that. More on my voodoo in another post. I'm tired. Good night.

Wednesday, January 5, 2011

If not a letter grade... then what?

The basic theme of this blog is my disdain for the letter grade. As a professional educator, evaluation of students is a big part of my life. So, if I were to not use letter grades, then how would I evaluate my students?

The first idea that comes to mind is a narrative comment. Allow me to consider a hypothetical situation to better explain what I mean. I teach math, so students take many tests (and no, I do not like tests either, but that seems to be a discussion for a different blog... at the very best it's a tangent, and being a math teacher, I should know all about tangents, both the mathematical kind as well as those meandering ones such as when I start going off about tangents when I am supposed to be considering evaluation... )

Anyway, let us pretend that I give my ninth graders a test and the test includes problems on algebra, some on trigonometry, a few on matrices, some graphing, and some transformation problems. As this is math, I can create an answer key and I can correct a student's work against this answer key and I can quickly see a student's mistakes. I do not merely consider a student's answers; I also look at the supporting work, that is, the steps a student took to find, for example, the inverse of a matrix or to solve an equation. So, regardless of a student's answer, I can see the methodology and, really, the thinking a student utilized to arrive at their answer. Thus, I can evaluate not only the correctness of their answers but also the appropriateness of their method and their ability to communicate this method.

So, I go through and correct their test. I mark which answers are right, which are wrong, and I indicate where their method is appropriate and where it is not. Like most math teachers I know, I can award points for correct answers as well as for supporting work. So, then I add up all of these points and arrive at a number. Now, many math teachers would know the possible number of points one might earn on a test, and at this point they would divide the number a student earned by the total possible to get a percentage which they would then convert magically into a letter. (See my last post to read some of my opinions on that.) But, the point of this post is to consider what to use in place of such a letter grade.

Back to the student's test... they have earned a number of points. Some of these points are for correct answers and some are for correct methods. Now, that number of points itself is feedback for a student and is thus a form of evaluation. Moreover, if I indicate clearly where they earned points and where they did not, then that is another level of evaluation. A student can see if they used the proper method and if their use of that method led to a correct answer. That, much more than a letter grade, tells a student if they have correctly learned how to find the inverse of a matrix or if their equation solving techniques are strong.

So, without any narrative comment (and without a silly letter), I have already provided a student with copious information regarding how they did. But, I have not really passed any sort of judgment yet as to my opinion on the quality of their work.

So, now the narrative comment becomes useful. I could convey to a student if I think they did well or not. I could judge their performance compared to earlier work. I could specifically state if they were strong on matrices, but weak on algebraic equations. I could clearly point out whether they chose the best methods or not and how accurate their answers were. I could also judge if they have been learning as much as I think they should have up to that point. Additionally, I could outline areas that they ought to focus on as they strive to improve.

Such a narrative could fill half a standard sheet of paper or more. Such a narrative could take 10-15 minutes to write. Such a narrative would provide far more useful feedback to a student than any letter grade could ever hope to.

But, what is the cost of such a narrative, filled as it would be entirely with my opinions? Does my opinion on the quality of the work really matter? Couldn't my corrections stand alone as all the evaluation I do and couldn't I then let a student make up their own opinions on the quality of their performance? That is a discussion for a future post.