Evaluating Hillary on Teacher Evaluations

I’ll admit, I had a bit of an emotional rollercoaster ride on Twitter on Monday.

It all started when I saw this tweet, which made me angry:

Hillary Clinton is planning a huge break with Obama on education: https://t.co/95vi3UPyqW pic.twitter.com/9DZAgdhWCN

— Vox (@voxdotcom) November 17, 2015

However, when I saw this tweet, I was hopeful that maybe Hillary’s “huge break” with Obama’s education policies wasn’t all it was cracked up to be:

Top Clinton aide: Hillary is “often confused” https://t.co/BcozyByBHQ pic.twitter.com/IxryabpWCz

— The Hill (@thehill) November 16, 2015

But my mood plummeted once again when I saw this tweet from Hillary’s long-time friend and supporter, Randi Weingarten:

Clinton says ‘no evidence’ that teachers can be judged by student test scores https://t.co/vTwYthqurf

— Randi Weingarten (@rweingarten) November 17, 2015

That last tweet, of course, refers to Hillary Clinton’s comments during a recent roundtable discussion hosted by the American Federation of Teachers (AFT) in New Hampshire. Clinton briefly touched on the topic of teacher evaluations when asked for her thoughts on the increased emphasis on testing under the Obama Administration:

“I have for a very long time also been against the idea that you tie teacher evaluation and even teacher pay to test outcomes. There’s no evidence. There’s no evidence. Now, there is some evidence that it can help with school performance. If everybody is on the same team and they’re all working together, that’s a different issue, but that’s not the way it’s been presented…”

Over the past few years, U.S. Secretary of Education Arne Duncan has prodded states to adopt teacher evaluations that incorporate value-added measures (VAM) of student performance by tying them to both Race to the Top and waivers from No Child Left Behind. Teachers unions and their supporters have pushed back against the policy, claiming that VAM is unreliable and is strongly influenced by factors outside of the classroom, such as poverty.

So is Hillary right to be skeptical about using students’ test results in teacher evaluations? Here are a few things to keep in mind about recent efforts to judge teacher performance:

I. Nobody evaluates teachers on the basis of test scores alone

Reform critics often make the claim that teachers are losing their jobs based on the outcome of a single test. That’s simply not the case. To my knowledge¹ (and please correct me if I’m wrong), there isn’t a single state that evaluates its teachers on the basis of test scores alone. In most places, test results only account for a fraction of a teacher’s overall evaluation score, which otherwise relies heavily on the results of classroom observations by school administrators.

Moreover, teacher evaluation laws in most states stipulate that teachers can only be terminated after they’ve been rated “ineffective” on two or more annual evaluations. So, to be clear: No, teachers are not being fired on the basis of a single test score.

Actually, this should say, “We don’t grade teachers solely on tests.”

II. Clinton is correct that pay-for-performance schemes haven’t worked

It seems logical to imagine that school districts could increase achievement by offering performance bonuses to teachers whose students beat expectations on annual standardized tests. However, Clinton is correct that numerous studies have shown that pay-for-performance schemes don’t lead to gains in achievement.

That being said, it’s important not to conflate performance bonuses with efforts to differentiate teacher compensation based on performance or other factors. Collective bargaining agreements often involve a fixed salary scale in which teacher pay is based on credentials and years of service. Unions have resisted efforts by some districts to adopt a more flexible compensation approach which can take into account other factors like prior performance, subject matter expertise, etc.

III. Studies show high value-added teachers make a difference

Contrary to Clinton’s assertion, there is evidence that teachers with high VAM scores have a long-term impact on student success. One of the most commonly cited studies on VAM comes from the economists Raj Chetty, Jonah Rockoff, and John Friedman, who tracked one million students from an urban school district from the 4th grade to adulthood to evaluate the accuracy of those measures, as well as to determine whether high value-added teachers improve students’ long-term outcomes.

In terms of VAM’s accuracy, Chetty, Rockoff, and Friedman’s research determined the following:

“We find that when a high VA teacher joins a school, test scores rise immediately in the grade taught by that teacher; when a high VA teacher leaves, test scores fall. Test scores change only in the subject taught by that teacher, and the size of the change in scores matches what we predict based on the teacher’s VA.”

The three economists also found that high value-added teachers had a significant, long-term impact on their students – an impact that persisted well into adulthood:

“We find that students assigned to higher VA teachers are more successful in many dimensions. They are more likely to attend college, earn higher salaries, live in better neighborhoods, and save more for retirement. They are also less likely to have children as teenagers. Teachers have large impacts in all the grades we analyze (4 to 8). Teachers’ impacts on earnings are also similar in percentage terms for students from low and high income families.”

Chetty, Rockoff, and Friedman showed that high value-added teachers have a demonstrable impact on students.

IV. States haven’t always used VAM in productive ways

I support rigorous teacher evaluations that incorporate student performance measures, but some states have used VAM in ways that are ultimately counterproductive to the effort to ensure that every classroom has an effective teacher. Because a majority of teachers are assigned to grades or content areas that are not assessed by state standardized tests, they don’t receive VAM scores every year. This poses a dilemma for policymakers who want to include an objective component like VAM in every teacher’s evaluation: how do you do that for a music teacher?

Some states (Florida and New Mexico being two such examples) have opted to include a school-wide student growth measure in the evaluations of teachers in non-tested grades and subjects. Essentially, this means those teachers are being evaluated, in part, on their students’ performance in other classes. Not only is this approach illogical and fundamentally unfair, it also gives ammunition to those opposed to evaluation reform who argue that the system is rigged against teachers.