NBER Reporter: Fall 2001
What Might School Accountability Do?
Education is currently at the forefront of the nation's political agenda: everyone, regardless of political persuasion, wants to see an improvement in the performance of U.S. schools. This consensus ends abruptly, however, when it comes to determining how to effect such a change in performance. One popular approach is to increase the accountability of schools to the public, by assessing schools on the basis of improvements in students' performance on standardized examinations and by offering remedies, such as increased choice (either within the public sector or through vouchers for private schools), reconstitution, or closure, in the event of persistent identified failure of a school to improve. School accountability is the centerpiece of President George W. Bush's education reform proposal, and in dozens of states, other accountability measures have been proposed or implemented.
This past summer, both the U.S. Senate and the U.S. House of Representatives passed education reform bills that set stringent standards for schools to meet. Both bills require states to test students in the third through eighth grades and to identify schools that, on the basis of the test results, fail to make "adequate yearly progress." Although the definition of adequate yearly progress is still ambiguous, both bills include provisions requiring that students who attend schools that fail to make such progress be granted additional public-school choice. In both bills, schools that persistently fail will be subject to increasingly severe sanctions, including reorganization or closure of the school. Both the House and the Senate bills have significant teeth: intriguing recent work by Thomas J. Kane, Douglas O. Staiger, and Jeffrey Geppert suggests that the vast majority of U.S. schools would face at least moderate sanctions under either bill. (2)
Much of my current research centers around issues related to school accountability. My work on accountability follows several strands. One strand, still in its infancy, involves school responses to accountability systems. In joint work with Cecilia Rouse, Dan D. Goldhaber, and Jane Hannaway, I am studying how Florida schools have changed their instructional policies, practices, and allocations of resources within and between schools as a result of increased accountability. In other work, which is being conducted in Florida, Virginia, and elsewhere, I am investigating noninstructional responses to accountability systems. Several specific projects along these lines study whether increased scrutiny has led to the removal of "problem" students from threatened schools (either by relabeling them as "disabled" or by active redrawing of school boundaries), as well as whether schools respond to accountability systems by manipulating school nutrition programs (that is, changing school menus during testing periods to increase nutrients that might stimulate short-term performance) in attempts to boost test performance. A second strand of research considers the often unintended consequences of design issues associated with school accountability. The specific design of an accountability system may have dramatic consequences for school choice, for instance, or for other factors not directly related to education. This essay focuses on my early work on this second strand of research.
There are a number of ways to measure a school's performance. One approach is to rate schools on the basis of some value-added measure, following the same students over time to gauge improvements in their performance. A handful of states, including North Carolina now and Florida next year, have implemented or are planning to implement a value-added system for school accountability. This type of solution is popular with economists because it attempts to deal with an important and vexing identification problem: whether high-performing students are doing well because their school is excellent or because of other factors, for example, family background. Of course, the same question can be asked about low-performing students. By following the same students over time, the analyst can isolate more clearly--although still imperfectly--the component of students' test scores that is most likely attributable to the school.
An alternative approach uses so-called "status" scores to measure school performance. Status scores can take many forms. They may simply be average test scores, or they may measure the fraction of a school's students who achieve some level of proficiency. Likewise, they may assess schools on the basis of their performance in a single year, or they may rate schools on changes from one year to the next in average test scores or in the percentage of students who attain competency. Florida's current accountability system and those of several other states follow this latter model. The appeal of status scores--particularly those measures that are based on the fraction of students who attain some threshold--comes from the popular notion that schools should bring all students to an adequate level of performance.
The accountability system that is currently being discussed in Congress is a hybrid of the status and value-added approaches. While the House and Senate education bills vary in their definition of adequate yearly progress, both propose to assess schools on the basis of improvements in their status scores--here, the fraction of students who attain a specific level of proficiency--from one year to the next. The House bill requires that a school be on a pace to have 100 percent of its students proficient within 12 years and rates each school on the basis of its making sufficient annual gains in proficiency to reach this goal. The Senate bill requires that schools increase the percentage of students in every subgroup who meet proficiency requirements by at least 1 percentage point. In another subtle distinction between the House and Senate bills, the Senate would assess schools on a rolling-average basis rather than solely on year-to-year changes in the percentage of proficient students. As mentioned above, Kane, Staiger, and Geppert estimate that even under the less stringent Senate version, an overwhelming majority of schools would fail to make adequate yearly progress as it is currently defined.
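To make the arithmetic of the House requirement concrete, the following minimal sketch computes the annual gain a school would need in order to stay on pace. It assumes a simple straight-line path to 100 percent proficiency over 12 years; that interpretation, and the function names, are illustrative assumptions and are not taken from the bill text, whose definition of adequate yearly progress is still ambiguous.

```python
def required_annual_gain(current_pct_proficient, years_remaining=12, target_pct=100.0):
    """Percentage-point gain needed each year on a straight-line pace (assumed rule)."""
    if years_remaining <= 0:
        raise ValueError("years_remaining must be positive")
    # A school already at or above the target needs no further gain.
    gap = max(target_pct - current_pct_proficient, 0.0)
    return gap / years_remaining


def on_pace(previous_pct, current_pct, years_remaining=12):
    """True if the observed one-year gain meets the pace implied by last year's level."""
    observed_gain = current_pct - previous_pct
    return observed_gain >= required_annual_gain(previous_pct, years_remaining)


if __name__ == "__main__":
    # Hypothetical school: 40 percent proficient last year, 43 percent this year.
    print(required_annual_gain(40.0))
    print(on_pace(40.0, 43.0))
```

Under this assumed rule, a school that starts at 40 percent proficiency must gain 5 percentage points a year, so the further a school is from full proficiency, the larger the annual gain it must show.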
Marianne Page and I report that in identifying which schools are failing to meet some standard, design matters quite a bit. (3) One obvious way that design issues make a difference involves the stringency of the standards. The Kane, Staiger, and Geppert study makes this point clear. But even if the stringency of standards is removed from the picture, the way school progress is defined can have a profound impact on which schools are identified as failing. The correlations among schools' rankings based on various measures of status scores from a single year of tests are very high--nearly perfect. However, the correlations between those rankings and the ones based on changes in test scores from one year to the next--using either value-added measures, as mentioned above, or changes-in-proficiency measures like those proposed in Congress--are extremely low. Using data from Florida, Page and I find that there is virtually no relationship between any measure of status scores and any measure that we could construct of scores that follow individual students. Stated differently, we find that status scores assess schools nearly randomly--if the goal of an accountability system is to measure a school's contribution toward student performance. If the goal instead is to identify which schools are moving toward proficiency, then the use of status scores as a measure of performance is clearly more appropriate.
Even were one to determine that status scores are appropriate for measuring school performance, however, as Congress has done, potentially serious implementation issues remain. Page and I also find that rankings based on changes in year-to-year scores in the same grade level--the approach proposed in the Congressional bills--bounce around considerably: the correlation between schools that make yearly progress along these lines in one year and schools that make yearly progress in the next year is strongly negative. Kane and Staiger find similar patterns in North Carolina, and they make the point that evaluating schools on the basis of this type of measure of progress leads to arbitrary assignment of school rankings. (4) Together with my recent findings, this suggests that assigning school grades in the manner Congress proposes may lead to effectively random assignment of school ranks, or at best to rankings that are largely unrelated to the school's likely contribution to student performance. Arbitrary performance measures, with the attendant rewards and punishments, may impose unnecessary risk on the stakeholders, including school administrators and teachers. This could lower the attractiveness of both working at and attending public schools.
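A negative correlation between one year's progress and the next is what one would expect whenever measured scores contain transitory noise. The sketch below is a simulation under stated assumptions, not a reanalysis of the Florida or North Carolina data: each hypothetical school has a fixed underlying quality, each year's observed score adds independent noise, and the means and standard deviations are arbitrary. In that setup, a school that happens to draw a good year shows a gain and then tends to show a loss the following year.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_gain_correlation(n_schools=5000, noise_sd=5.0, seed=0):
    """Correlation between consecutive year-to-year score changes.

    Assumptions (illustrative only): school quality is constant over time,
    and each year's observed score is quality plus independent noise.
    """
    rng = random.Random(seed)
    first_gains, second_gains = [], []
    for _ in range(n_schools):
        quality = rng.gauss(50.0, 10.0)
        year1, year2, year3 = (quality + rng.gauss(0.0, noise_sd) for _ in range(3))
        first_gains.append(year2 - year1)
        second_gains.append(year3 - year2)
    return pearson(first_gains, second_gains)


if __name__ == "__main__":
    print(round(simulate_gain_correlation(), 2))
```

With constant quality and independent noise, the two consecutive changes share the middle year's noise with opposite signs, so their theoretical correlation is -0.5 regardless of how much schools differ in quality. Real data would also reflect genuine changes in school performance, which this sketch deliberately leaves out.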
This point has relevance in several other critical areas, the first of which is school choice. Both the House and the Senate bills embed school choice into the accountability system: students attending schools that fail to make adequate yearly progress are offered choice within the public sector. (Florida is currently the only state in which students who attend persistently failing schools have the option of attending private schools. However, at this time, none of Florida's public schools is identified as failing.) Many economists find school choice appealing for two reasons. First, it broadens schooling options for families whose choices might otherwise be limited (by such constraints as low incomes, job location, and residential segregation). Second, the increased competition among schools may spur efficiency gains because the loss of student revenues provides schools with an incentive to improve. In the policy arena, an alternative argument for school choice is often put forward: school choice is justified because it provides students with a means of exiting low-quality schools, thus avoiding failure. Economists typically are less comfortable with this logic because it suggests that the state is better than parents at assessing school quality and because it leaves open the question of whether parents who had voluntarily selected low-quality schools in the past would make a better choice if they had more options.
Page and I argue that the structure of a school accountability system may have profound implications for determining which students are offered choice in a program that is integrated with accountability. We find that accountability systems that grade schools on the basis of changes from one year to the next (using either value-added measures or changes in proficiency) tend to provide choice to a much more advantaged group of students than systems that offer choice to students who attend schools with low absolute-status scores or systems that offer choice directly to disadvantaged students. Stated differently, integrating choice and accountability in the manner proposed by Congress would probably be less effective in providing choice to constrained households than offering choice alone, separate from an accountability system.
The design of a school accountability system also may have repercussions that extend beyond the conduct and choice of schools. Maurice Lucas and I recently studied the independent effects of school ranks on real estate markets. (5) Using data on every real estate transaction in the Gainesville, Florida, metropolitan area over a six-year period, we were able to isolate the effect of Florida's imposition of school rankings in May 1999. We find that--independent of test scores, neighborhood quality, or home attributes--the state's assignment of grades to schools had large, statistically significant effects on the prices of houses. Houses within the zones of schools that received a grade of A increased in value by nearly 9 percent relative to those in zones of schools receiving a B, and houses within zones of schools that received a grade of B increased in value by 9 percent relative to those in zones whose schools received a C. Given that the distinctions between such school grade levels are often arbitrary and, as mentioned above, that the design of the school-grading system itself could have important effects on determining which schools earn good grades and which schools earn poor grades, the specific nature of the federal accountability system might have major effects on house prices, thereby directly affecting even those households with no school-age children.
Fifty Ways to Game the System?
The economic rationale behind school accountability systems is that they will provide schools with an incentive to change and to improve their performance. Certainly, accountability systems should lead schools to focus more attention on the basic skills chosen for evaluation (such as reading, writing, and mathematics). Provided the tests are rigorous and represent the broad set of skills that policymakers want students to master, this is a positive aspect of school accountability systems. However, this increased attention paid to certain specific skills may lead schools to abandon instruction in topics that others in society believe are essential but that are not directly covered by the accountability system. Such a system may also lead to gaming: schools will have an incentive to manipulate resources (and possibly students) so that their apparent productivity is overstated. For example, if they are rated using a value-added system, schools might transfer their most effective teachers from the "baseline" grade to the "evaluation" grade. Or they might shuffle students among schools--or into special-education classes--to alter the composition of student groups who take the accountability test at a school. Of course, these behaviors may not be purely manipulative, and they may have productive consequences as well. As I mentioned in the introduction to this essay, I am currently studying the myriad ways in which schools respond to accountability systems and the performance effects of these systems.
While it is too early to judge the degree to which schools respond to these incentives to manipulate resources, it is possible to draw some inferences from my recent work on local government (including school) responses to fiscal accountability measures. Arthur O'Sullivan and I asked whether local governments that are faced with binding tax-and-expenditure limitations choose to manipulate their "service mix" in order to induce voters to override such limitations. (6) We identify conditions under which the incentives to do so should be strongest and find that cities and school districts are most likely to change their behaviors as predicted when the incentives are strongest. Although the context is different, the fundamental lessons of that paper can be applied to school accountability systems as well.
In summary, the new school-accountability measures put in place by many states (and probably the federal government) will likely change some of the ways schools do business. Of course, this is the intent of these policies. But when designing these systems, we must be careful to minimize the incentives for schools to manipulate their resources in an ultimately unproductive manner and to ensure that the schools labeled "excellent" or "failing" truly are excellent or failing along the desired lines. In addition, we should try to avoid creating the impression of randomness in the assignment of school grades. The more random the rewards and punishments appear to be, the less likely schools are to effect meaningful change. The literature on tax limitation provides another lesson that might be applied here: Kim Rueben and I find that tax limitations led to a worse degradation of the teaching force than might have been expected on the basis of the actual changes in resources and salaries. (6) Likewise, perceived randomness in school grading might change the nature of the teaching force. Although this last point is speculative, our concern is genuine.
1. Figlio is a Faculty Research Fellow in the NBER's Program on Children and the Walter J. Matherly Professor of Economics at the University of Florida. His "Profile" appears later in this issue.
2. T. J. Kane, D. O. Staiger, and J. Geppert, "Assessing the Definition of 'Adequate Yearly Progress' in the House and Senate Education Bills," UCLA Working Paper, July 2001.
3. D. N. Figlio and M. E. Page, "Can School Choice and School Accountability Successfully Coexist?" forthcoming in The Economics of School Choice, C. Hoxby, ed., University of Chicago Press.
6. D. N. Figlio and A. O'Sullivan, "The Local Response to Tax Limitation Measures: Do Local Governments Manipulate Voters to Increase Revenues?" Journal of Law and Economics, 44 (April 2001), pp.233-57.