With our government struggling to rethink its landmark No Child Left Behind legislation, I am reminded of a great article co-authored by Richard Rothstein for Ed Week titled "Proficiency for All Is an Oxymoron." In it, Rothstein and his colleagues argue that our nation’s focus on ensuring that every child is "proficient" in basic subject areas like math and reading by 2014 may actually harm the quality of instruction in our nation’s classrooms. They write:
The federal education legislation does not define proficiency, but refers to the National Assessment of Educational Progress. Although the Bush administration winks and nods when states require only low-level skills, the law says proficiency must be “challenging,” a term taken from NAEP’s definition. Democrats and Republicans stress that the No Child Left Behind law’s tough standards are a world apart from the minimum competency required by 1970s-style accountability programs.
But no goal can be both challenging to and achievable by all students across the achievement distribution. Standards can either be minimal and present little challenge to typical students, or challenging and unattainable by below-average students. No standard can simultaneously do both—hence the oxymoron—but that is what the No Child Left Behind law requires.
What makes "proficiency for all" even more intimidating is that our only uniform measure of assessment, the National Assessment of Educational Progress, sets almost impossible standards that even countries recognized as world leaders in education would struggle to meet. As Rothstein and his colleagues note:
We can compare performance in top-scoring countries with NAEP’s proficiency standard. Comparisons are inexact—all tests don’t cover identical curricula, define grades exactly the same, or have easily equated scales. But rough comparisons can serve policy purposes.
On a 1991 international math exam, Taiwan scored highest. But if Taiwanese students had taken the NAEP math exam, 60 percent would have scored below proficient, and 22 percent below basic. On the 2003 Trends in International Mathematics and Science Study, 25 percent of students in top-scoring Singapore were below NAEP proficiency in math, and 49 percent were below proficiency in science.
On a 2001 international reading test, Sweden was tops, but two-thirds of Swedish students were not proficient in reading, as NAEP defines it.
Could this be another example of oversimplification, where policymakers and parents lose sight of the complexity of education and instead make decisions based on cliches and catchphrases that seem to make sense but rest on a fragile understanding of what is possible? After reading Rothstein’s description of how NAEP’s proficiency standards were set, you’d have to answer yes:
NAEP officials assembled some teachers, businesspeople, parents, and others, presented these judges with NAEP questions, and asked their opinions about whether students should get them right. No comparison with actual performance, even in the best schools, was required. Judges’ opinions were averaged to calculate how many NAEP questions proficient students should answer.
From the start, experts lambasted this process. When officials first contemplated defining proficiency, the NAEP board commissioned a 1982 study by Willard Wirtz, a former U.S. secretary of labor, and his colleague Archie Lapointe, a former executive director of NAEP. They reported that “setting levels of failure, mediocrity, or excellence in terms of NAEP percentages would be a serious mistake.” Indeed, they said, it would be “fatal” to NAEP’s credibility. Harold Howe II, a former U.S. commissioner of education responsible for NAEP’s early implementation, warned the assessment’s administrators that expecting all students to achieve proficiency “defies reality.”
In 1988, Congress ordered NAEP to determine the proficient score. Later, U.S. Sen. Edward M. Kennedy’s education aide, who wrote the bill’s language, testified that Congress’ demand was “deliberately ambiguous” because neither congressional staff members nor education experts could formulate it precisely. “There was not an enormous amount of introspection,” the aide acknowledged.
Once achievement levels were set, the government commissioned a series of evaluations. Each study denounced the process for defining proficiency, leading to calls for yet another evaluation that might generate a better answer.
The first such evaluation, conducted by three respected statisticians in 1991, concluded that “the technical difficulties are extremely serious.” To continue the process, they said, would be “ridiculous.” Their preliminary report said that NAEP’s willingness to proceed in this way reflected technical incompetence. NAEP fired the statisticians.
The trend towards holding schools accountable for student achievement was an essential—and long overdue—change in American education. For too long, we were willing to overlook the glaring gaps that existed between students who were succeeding and those who struggled as long as schools looked good from the outside. But accountability must be based on realistic standards and goals in order to retain credibility. A program that seems—in Rothstein’s words—to be "divorced from reality" cannot possibly benefit a nation committed to genuine improvement.
If President Bush is sincere when he says that "Measurement is not a tool to punish. Measurement is a tool to correct and reward," then he will advocate for corrections to the flawed measurement tools being used to judge schools under the No Child Left Behind legislation.