Why testing has failed our schools
Pressure to raise achievement test scores has become the driving force behind what is taught and how. Worse, the results from these tests reveal only a fraction of the overall picture of whether students are learning—and whether teachers are effective in their jobs, says Daniel Koretz.
Koretz, one of the nation’s foremost experts on education testing and the Henry Lee Shattuck professor of education at the Harvard Graduate School of Education, says the whole idea of test-based accountability has failed.
In The Testing Charade: Pretending to Make Schools Better (University of Chicago Press, 2017), Koretz says the pressure to produce results often leads to outright cheating.
“Test-based accountability has become an end in itself in American education,” Koretz says, “unmoored from clear thinking about what should be measured, how it should be measured, or how testing can fit into a rational plan for evaluating and improving our schools.”
I have to say, reading your book made me angry at times.
Well, thank you. I have to admit, I wrote it in part because I was angry. Probably came through.
You’ve been studying testing for about 30 years, and you tried to warn of the misuse of standardized tests early on.
It’s depressing. My concern was that it was clear that using tests in this way wasn’t going to work, that it was going to generate serious side effects, and that, ironically, it was going to undermine the value of tests.
In 1987, I published something that said that one of the consequences of high-stakes testing is that test scores will be less valuable. And that has been true.
That’s Campbell’s Law that you wrote about in the book.
Campbell was one of the founders of the scientific study of program evaluation. He said, “The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.”
We’ve known for a long time that this would happen. The majority of people who study incentive systems in a variety of fields know and accept that it happens. That makes it even more striking that people in education have pretended that it doesn’t.
Hence the word “charade” in your book’s title.
Exactly. It’s pretense. Everywhere I turned, people in education were acting as though this weren’t true, even though they had evidence to suggest it was. They found it inconvenient, so they pretended it wasn’t there.
You write that standardized testing has value, but not in the way we’re currently using it.
Standardized tests were designed, primarily, to monitor and diagnose. And they work pretty well for that.
For instance, at the same time that racial differences in achievement have been narrowing, socioeconomic status differences—more specifically, the gap between rich and poor kids—have been growing. And we know that because of standardized tests.
So we need to measure what matters in multiple ways. How can we get more valid assessments?
What I always say to people is, even before you get to that question, ask yourself what you want to see when you walk into a classroom. If you can’t describe what you want teachers to do, and what you want them to do better, there’s no point in even starting to design an accountability system.
When I visit a classroom and see kids absolutely bored to tears, then I know that the classroom is not acceptable, even if the test scores are good.
When I observe classrooms, at the top of my list is student engagement. I want to see them motivated. I want to see them committed to real intellectual work, as opposed to just sitting there doing chores, doing worksheets and so on.
Different teachers produce that kind of engagement in different ways. I want to see what teachers are producing in the classrooms, and not just a standardized list of things that they should be doing.
There are many teachers and district leaders who now equate good instruction with just getting better scores.
It’s even worse than that. In many places teachers are told that good instruction is test prep.
This is something that students here at Harvard, who are former teachers, have been telling me for years—that it’s not just that they were told explicitly to do test prep to raise test scores. They were told that doing test prep is good instruction. And they were often given these little menus for how to do it: Do step one, do step two, do step three.
So they don’t recognize the contrast between good instruction and test prep. They’ve never seen anything else.
The clearest example I can give comes from a former inner-city teacher. She said, “I know that what we were doing was not what the kids really needed. But you tell me: how else was I going to meet the targets that were set for me?”
That, in two sentences, sums up what’s wrong with this system. Instead of teachers asking, “What can I do to improve their learning?” it’s, “What can I do to get their numbers up?” And in many cases that has turned out to be simple cheating.
Let me stress one thing in the book that doesn’t get as much play as I would like. Cheating is just a canary in the coal mine. There’s a huge amount of test prep that produces fraudulent gains in performance. And if you define cheating that way, it’s everywhere.
It’s not just in Atlanta, where people changed answer sheets, or El Paso, where they made kids disappear from the enrollment rolls. It’s all over the United States.
Are you hopeful we can change it?
I think we can. But it’s going to be a tough slog, because there are a lot of people who stand to lose if we back away from this. You’ve got all the people who pegged their careers on these reforms. Suddenly they’re going to be in an awkward spot.
There are probably close to 1 million teachers who will need to be completely retrained because they’ve never seen any other type of instruction.
A former teacher asked me, “Where is the point of leverage to start changing things?” And I think the answer has to be partly at the top.
When you say at the top—do you mean the federal level?
Right now, many people tend to point fingers at the federal level because the feds took control of this with No Child Left Behind. But keep in mind, this heavy-handed test-based accountability really began at the state level. Some of the states will really have to change their tune as well. We don’t know exactly what recipe will work.
There are huge differences in the challenges facing, for instance, teachers in low-income schools and high-income schools. So why would we insist that they do things in exactly the same way? It doesn’t make sense to me that they would.
Some schools have large populations of kids who don’t speak English well, and who aren’t in a position to pick it up quickly on their own. Seems to me they should be organized differently.
Unfortunately, many families without resources are relying solely on schools to educate their kids, and they’re not being served.
That’s one reason I got so fed up with the pretense. The motivation for a lot of the reforms of the last 15 years was to begin to reduce inequities. That was why, when NCLB was in Congress, people like Ted Kennedy and George Miller supported a Republican bill. They wanted to pressure schools to improve things for the kids at the bottom.
There’s just no evidence that it has worked.
I believe we have an ethical obligation to admit that we failed and we have to look for another approach. And we can argue about what that other one should be, but it’s not this one. We’ve don