HIGH STAKES TESTING AND SOCIAL PROMOTION
Robert M. Hauser, Ph.D.
Chair, Committee on Appropriate Test Use
Board on Testing and Assessment
National Academy of Sciences/National Research Council
Vilas Research Professor of Sociology
Center for Demography and Ecology
The University of Wisconsin-Madison
Committee on Health, Education, Labor, and Pensions
United States Senate
APRIL 29, 1999
Good morning, Mr. Chairman and members of the Committee. My name is Robert Hauser. I am Vilas Research Professor of Sociology at the Center for Demography and Ecology, University of Wisconsin-Madison. During 1998, I served as Chair of the Committee on Appropriate Test Use of the Board on Testing and Assessment at the National Research Council. The Research Council is the operating arm of the National Academy of Sciences, which was chartered by Congress in 1863 to advise the government on matters of science and technology. I was elected to the Academy in 1984. I am delighted to be here today, and request that my testimony be entered into the record.
The Committee on Appropriate Test Use prepared its report, High Stakes: Testing for Tracking, Promotion, and Graduation, in response to Congressional interest in issues related to the Clinton administration’s 1997 proposal for Voluntary National Testing. The Committee’s charge was "to recommend appropriate methods, practices, and safeguards to assure that existing and new tests ... are not used in a discriminatory manner or inappropriately for student promotion, tracking, or graduation, and existing and new tests adequately assess student reading and mathematics comprehension in the form most likely to yield accurate information regarding student achievement of reading and mathematics skills" (P.L. 105-78, Sec. 309). The NRC panel was a diverse group of 15 scholars from all across the country.
The panel took no position about the value of voluntary national testing for its stated purposes, to tell American students, parents, and teachers how well they are doing relative to high national standards. However, we recommended strongly against relying on such tests to make "high-stakes" decisions about tracking, promotion, and graduation of individual students. The report was released in September of 1998 and now is available as a National Academy book. The book has a lot of useful information about proper test use -- information that is by no means limited to the proposed national tests (which are still under development).
I would like now to summarize some of our key findings about test use and then focus on data and issues relating more specifically to the debate over ending "social promotion."
The Committee applied an analytical framework that weaves together three crucial concepts:
• test validity (how well a test covers the knowledge and skills it is intended to cover),
• attribution of cause (whether student performance reflects knowledge and skills acquired after proper instruction), and
• consequences of test use (whether as a result of testing students receive the best available educational treatments).
Although it is clear that tests can and often do have high stakes for teachers, administrators, schools, districts, or states, the Committee interpreted its mandate more narrowly, and focused directly on the use of tests to make decisions about tracking, promotion, and graduation of individual students.
Based on this framework, which is explained in detail in our report, we developed a set of principles to guide appropriate test use. Among them, the following are most germane to today’s hearing:
• Any particular test has validity only in relation to specific uses.
• Tests are not perfect, but neither are the alternatives to tests.
• No high-stakes educational decision about a test-taker should be made solely or automatically on the basis of a single test score; other relevant information should also be taken into account.
• Neither test scores nor any other kind of information can justify educational decisions that are not beneficial for students.
• Tests should be used for high-stakes decisions only after students have been taught the knowledge and skills on which they will be tested.
One of our strongest recommendations is that "Accountability for educational outcomes should be a shared responsibility of states, school districts, public officials, educators, parents, and students. High standards cannot be established and maintained merely by imposing them on students" (p. 5).
In conducting our study, Mr. Chairman, we reviewed an abundant literature and sought relevant data to inform our work. With respect to the use of tests in tracking decisions, for example, we found little empirical data on the specific effects of testing, but considerable information that challenges the efficacy and fairness of placing students in typical low-track environments that are starved of intellectual or social stimulation. Hence, we conclude that using tests cannot justify these kinds of placements.
This is not to say that all forms of tracking are bad for students. Our findings were based on the actual and typical, not the ideal. But research evidence based on actual experience in the schools should inform new policies.
With respect to retention in grade, the research evidence is overwhelming: Simply holding back students who have not achieved to the appropriate standard does not work. Now, no one favors promoting students who have not mastered the work of one grade and who are clearly not ready for work in the next grade; but flunking them, holding them over for a repeat year, and simply assuming that this will help them overcome their educational deficits, is ineffective and may even aggravate an already untenable situation. Among our findings:
• Students who have been held back typically do not catch up;
• Low-performing students learn more if they are promoted -- even without remedial help -- than if they are held back;
• Students who have been held back are much more likely to drop out before completing high school;
• The long-term costs of holding students back are high to students and to school systems. The cost to a student is a year of their life. The costs to schools are financial and appear also in classroom management, social interaction, and the burden on teachers and administrators; and
• The negative effects of holding students back are often invisible to those who make retention decisions because they occur many years later.
Many people believe social promotion is the norm, and we have no doubt that many students are promoted who have not mastered grade-level material or reached high standards. But how widespread is social promotion? There is little direct evidence about that question. Ironically, much of the evidence about the educational harm of retention in grade is available precisely because we already do a lot of it. Our analysis of data from the Current Population Surveys of the US Bureau of the Census showed that:
• In 1970, almost all six-year-olds were in the first grade. By 1996, 18 percent of six-year-olds were enrolled below the first grade. Part of that change is due to holding children back in kindergarten, though the data do not tell us exactly how much.
• Nationally, among children who entered school in the late 1980s, 21 percent were enrolled below the usual grade at ages 6-8; 28 percent were below the usual grade at ages 9-11; 31 percent at ages 12-14; and 36 percent at ages 15-17. Since this does not count kindergarten and the later grades of high school, at least 15 percent of children -- and probably more than 20 percent -- are held back at some time in their childhood.
• Minorities and poor children are the most likely to be held back: All groups of children start school at about the same age. By ages 15-17, about 45 percent of male African-American and Hispanic youth are below the expected grade level for their age.
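One way to read the "at least 15 percent" figure is as the growth in the below-grade share between the youngest and oldest age groups. This reading is an assumption; the testimony does not spell out the calculation. A minimal sketch of that arithmetic, using only the percentages quoted above (the variable and function names are illustrative):

```python
# Percent of children enrolled below the usual grade for their age,
# by age group, as quoted in the testimony.
below_grade_pct = {"6-8": 21, "9-11": 28, "12-14": 31, "15-17": 36}


def growth_in_below_grade_share(shares, first, last):
    """Percentage-point change in the below-grade share between two age groups."""
    return shares[last] - shares[first]


# Assumed reading: children who fall below grade after ages 6-8 must have
# been held back at some point, so this growth is a lower bound on retention.
growth = growth_in_below_grade_share(below_grade_pct, "6-8", "15-17")
print(growth)
```

Under this reading the difference is a floor rather than an estimate, since it leaves out retention in kindergarten and in the later grades of high school.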
Let me emphasize that, in questioning some positions in the social promotion debate, we are most emphatically not advocating for the status quo nor for continued neglect of our lowest-performing students. Our committee, which focused on testing as a tool for high-stakes decisions about students, did not evaluate all of the available data on retention in grade. However, the evidence we reviewed leads us to conclude that simply holding children back in school is poor educational policy. It will make much more sense to identify children’s learning problems early on, and to invest in appropriate strategies to remedy those problems before the only choices are flunking or social promotion. There is good evidence that smaller class sizes, better-trained teachers and principals, a challenging curriculum, high expectations, good after-school programs, and summer school can make a large and positive difference in opportunity and achievement for all children. Flunking kids does not help them.
The data I have presented tell us about what has happened in the past, not what need happen in the future.
• We must ask ourselves, what are the likely consequences, both immediate and in the long-term, of well-intentioned efforts to raise educational standards?
• We need strong evidence that reforms will work before we put them in place on a large scale.
• We need a commitment to measure reforms and their consequences as they take place.
• We must remember what the problem is as we seek to solve it. The problem is not social promotion; it is low academic achievement.
In closing, I return to overall findings of the NRC report. When used appropriately, high-stakes tests can help promote student learning and equal opportunity in the classroom by defining standards of student achievement and by helping school officials identify areas in which students need additional or different instruction. When used inappropriately, high-stakes tests can undermine the quality of education and reduce opportunities for some students, especially if results are misinterpreted or misused, or students are relegated to a low-quality educational experience as a result of their scores.
I welcome questions, Mr. Chairman, and thank you again for this opportunity to share information on the important issue of the uses of tests to bolster our children’s education.
National Research Council
1999 High Stakes: Testing for Tracking, Promotion, and Graduation. Jay P. Heubert and Robert M. Hauser, eds. Committee on Appropriate Test Use, Board on Testing and Assessment, Commission on Behavioral and Social Sciences and Education, National Research Council. Washington, DC: National Academy Press.
Robert M. Hauser is the Vilas Research and Samuel A. Stouffer professor of sociology at the University of Wisconsin at Madison. His current research includes the Wisconsin Longitudinal Study, a long-term study of 10,000 high school graduates which is providing new data for studies of aging, the life course and social stratification. He has also studied trends and differentials in school enrollment, aspirations, and attainment from the 1940s to the 1990s. Dr. Hauser is a member of the National Academy of Sciences, of the American Academy of Arts and Sciences, and of the National Academy of Education. He has previously served on studies by the National Academy of Sciences of the status of Black Americans and of the measurement of poverty. He has served as Director of the Institute for Research on Poverty and of the Center for Demography and Ecology at the UW-Madison. He is currently a member of the Board on Testing and Assessment at the National Research Council. Dr. Hauser received a B.A. degree in economics from the University of Chicago and M.A. and Ph.D. degrees in sociology from the University of Michigan.