The National Assessment of Educational Progress (NAEP) is the largest continuing and nationally representative assessment of what U.S. students know and can do in various subjects. NAEP is a congressionally mandated project administered by the National Center for Education Statistics (NCES), within the Institute of Education Sciences (IES) of the United States Department of Education. The first national administration of NAEP occurred in 1969.[1] The National Assessment Governing Board (NAGB) is an independent, bipartisan board that sets policy for NAEP and is responsible for developing the framework and test specifications. The board, whose members are appointed by the U.S. Secretary of Education, includes governors, state legislators, local and state school officials, educators, business representatives, and members of the general public. Congress created the 26-member Governing Board in 1988.
NAEP results are designed to provide group-level data on student achievement in various subjects, and are released as The Nation's Report Card.[2] There are no results for individual students, classrooms, or schools. NAEP reports results for different demographic groups, including gender, socioeconomic status, and race/ethnicity. Assessments are given most frequently in mathematics, reading, science and writing. Other subjects such as the arts, civics, economics, geography, technology and engineering literacy (TEL) and U.S. history are assessed periodically.
In addition to assessing student achievement in various subjects, NAEP also surveys students, teachers, and school administrators to help provide contextual information. Questions asking about participants' race or ethnicity, school attendance, and academic expectations help policy makers, researchers, and the general public better understand the assessment results.
Teachers, principals, parents, policymakers, and researchers all use NAEP results to assess student progress across the country and develop ways to improve education in the United States. NAEP has been providing data on student performance since 1969.[3] [4]
NAEP uses a sampling procedure that allows the assessment to be representative of the geographical, racial, ethnic, and socioeconomic diversity of the schools and students in the United States. Data are also provided on students with disabilities and English language learners. NAEP assessments are administered to participating students using the same test booklets and procedures, except for accommodations for students with disabilities,[5] [6] so NAEP results can be used to compare the states and urban districts that participate in the assessment.
There are two NAEP websites: the NCES NAEP website and The Nation's Report Card website. The first site details the NAEP program holistically, while the second focuses primarily on the individual releases of data.
NAEP began in 1964, with a grant from the Carnegie Corporation to set up the Exploratory Committee for the Assessment of Progress in Education (ESCAPE). The first national assessments were held in 1969. Voluntary assessments for the states began in 1990 on a trial basis and in 1996 were made a permanent feature of NAEP to be administered every two years. In 2002, selected urban districts participated in the state-level assessments on a trial basis and continue as the Trial Urban District Assessment (TUDA).
The development of a successful NAEP program has involved many, including researchers, state education officials, contractors, policymakers, students, and teachers.[7]
There are two types of NAEP assessments, main NAEP and long-term trend NAEP. This separation makes it possible to meet two objectives:
Main NAEP assessments are conducted in a range of subjects with fourth-, eighth- and twelfth-graders across the country. Assessments are given most frequently in mathematics, reading, science, and writing. Other subjects such as the arts, civics, economics, geography, technology and engineering literacy (TEL), and U.S. history are assessed periodically.
These assessments follow subject-area frameworks that are developed by the NAGB and use the latest advances in assessment methodology.[8] Under main NAEP, results are reported at the national level, and in some cases, the state and district levels.
National NAEP reports statistical information about student performance and factors related to educational performance for the nation and for specific demographic groups in the population (e.g., race/ethnicity, gender). It includes students from both public and nonpublic (private) schools and, depending on the subject, reports results for grades 4, 8, and 12.
State NAEP results are available in some subjects for grades 4 and 8. This allows participating states to monitor their own progress over time in mathematics, reading, science, and writing. They can then compare the knowledge and skills of their students with students in other states and with the nation.
The assessments given in the states are exactly the same as those given nationally. Traditionally, state NAEP was assessed only at grades 4 and 8. However, a 2009 [9] pilot program allowed 11 states (Arkansas, Connecticut, Florida, Idaho, Illinois, Iowa, Massachusetts, New Hampshire, New Jersey, South Dakota, and West Virginia) to receive scores at the twelfth-grade level.
Through 1988, NAEP reported only on the academic achievement of the nation as a whole and for demographic groups within the population. Congress passed legislation in 1988 authorizing a voluntary Trial State Assessment. Separate representative samples of students were selected from each state or jurisdiction that agreed to participate in state NAEP. Trial state assessments were conducted in 1990, 1992, and 1994. Beginning with the 1996 assessment, the authorizing statute no longer considered the state component a "trial."
A significant change to state NAEP occurred in 2001 with the reauthorization of the Elementary and Secondary Education Act, also referred to as "No Child Left Behind" legislation. This legislation requires that states which receive Title I funding must participate in state NAEP assessments in mathematics and reading at grades 4 and 8 every two years. State participation in other subjects assessed by state NAEP (science and writing) remains voluntary.
Like all NAEP assessments, state NAEP does not provide individual scores for the students or schools assessed.
The Trial Urban District Assessment (TUDA) is a project developed to determine the feasibility of using NAEP to report on the performance of public school students at the district level. As authorized by Congress, NAEP has administered the mathematics, reading, science, and writing assessments to samples of students in selected urban districts.
TUDA began with six urban districts in 2002, and has since expanded to 27 districts for the 2017 assessment cycle.
Long-term trend NAEP is administered to 9-, 13-, and 17-year-olds periodically at the national level. Long-term trend assessments measure student performance in mathematics and reading and allow the performance of today's students to be compared with that of students since the early 1970s.
Although long-term trend and main NAEP both assess mathematics and reading, there are several differences between them. In particular, the assessments differ in the content assessed, how often the assessment is administered, and how the results are reported. These and other differences mean that results from long-term trend and main NAEP cannot be compared directly.[10]
Although NAEP has been administered since the 1970s, in 2021 U.S. Department of Education officials decided to postpone the assessment in math and reading because of the COVID-19 pandemic. The reasons for postponing included the possibility that differing distance learning options would skew student samples and results, as well as safety concerns for proctors and students.[11]
NAGB sets the calendar for NAEP assessments. Please refer to the entire assessment schedule for all NAEP assessments since 1968 and those planned through 2017.
Main NAEP assessments are typically administered over approximately six weeks between the end of January and the beginning of March of every year. Long-term trend assessments are typically administered every four years by age group between October and May. All of the assessments are administered by NAEP-contracted field staff across the country.
NAEP is conducted in partnership with states. The NAEP program provides funding for a full-time NAEP State Coordinator (NSC) in each state. The coordinator serves as the liaison between NAEP, the state's education agency, and the schools selected to participate.
NSCs provide many important services for the NAEP program and are responsible for:
While most NAEP assessments are administered in a paper-and-pencil format, NAEP is evolving to address the changing educational landscape through its transition to digitally based assessments (DBAs). NAEP is using the latest technology available to deliver assessments to students, and as technology evolves, so will the way DBAs are delivered. The goal is for all NAEP assessments to be paperless by the end of the decade. The 2011 writing assessment was the first to be fully computer-based.
In 2009, interactive computer tasks (ICTs) were administered as part of the paper-and-pencil science assessment. Computer delivery allows measurement of science knowledge, processes, and skills that cannot be assessed in other modes. Tasks included investigations involving observations of phenomena that would otherwise take a long time, modeling of phenomena that are very large in scale or invisible to the naked eye, and research using extensive resource documents.
This special study in multi-stage testing, implemented in 2011, investigated the use of adaptive testing principles in the NAEP context. A sample of students was given an online mathematics assessment that adapted to their ability level. All of the items in the study were existing NAEP items.
The TEL assessment framework describes technology and engineering literacy as the capacity to use, understand, and evaluate technology as well as to understand technological principles and strategies needed to develop solutions and achieve goals. The three areas of the assessment are:
Eighth-grade students throughout the nation took the assessment in winter of 2014. Results from this assessment were released in May 2016.
In 2011, NAEP transitioned its writing assessment (at grades 8 and 12) from paper and pencil to a computer-based administration in order to measure students' ability to write using a computer. The assessment takes advantage of many features of current digital technology and the tasks are delivered in multimedia formats, such as short videos and audio. Additionally, in an effort to include as many students as possible, the writing computer-based assessment system has embedded within it several universal design features such as text-to-speech, adjustable font size, and electronic spell check. In 2012, NAEP piloted the computer-based assessment for students at grade 4.
In addition to the assessments, NAEP coordinates a number of related special studies that often involve special data collection processes, secondary analyses of NAEP results, and evaluations of technical procedures.
Achievement gaps occur when one group of students outperforms another group and the difference in average scores for the two groups is statistically significant (that is, larger than the margin of error). In initial report releases, NAEP highlights achievement gaps across student groups. NAEP has also released a number of reports and data summaries that focus on achievement gaps. Some examples include the School Composition and the Black-White Achievement Gap and the Hispanic-White and the Black-White Achievement Gap Performance.[12] These publications use NAEP scores in mathematics and/or reading for these groups to either provide data summaries or illuminate patterns and changes in these gaps over time. Research reports, like the School Composition and Black-White Achievement Gap, also include caveats and cautions about interpreting the data.
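The significance criterion described above can be illustrated with a minimal sketch. It assumes two independent group means with known standard errors and a normal approximation at the 95 percent level. NAEP's own procedures account for its complex sample design and for multiple comparisons, so this is a simplified illustration rather than NAEP's method. The function name and all scores below are hypothetical.

```python
import math

def gap_is_significant(mean_a, se_a, mean_b, se_b, z_crit=1.96):
    """Compare two independent group means (simplified, hypothetical helper).

    Returns (gap, margin, significant), where margin is the margin of
    error of the difference under a normal approximation.
    """
    gap = mean_a - mean_b
    # Standard error of a difference between two independent estimates.
    se_gap = math.sqrt(se_a ** 2 + se_b ** 2)
    margin = z_crit * se_gap
    return gap, margin, abs(gap) > margin

# Hypothetical average scores and standard errors, not NAEP data.
gap, margin, significant = gap_is_significant(250.0, 1.0, 246.0, 1.0)
print(f"gap={gap:.1f}, margin={margin:.2f}, significant={significant}")

# The same kind of comparison with larger standard errors.
gap2, margin2, significant2 = gap_is_significant(250.0, 2.0, 248.0, 2.0)
print(f"gap={gap2:.1f}, margin={margin2:.2f}, significant={significant2}")
```

In the first comparison the 4-point difference exceeds its margin of error and would count as a gap; in the second, the 2-point difference falls inside the margin and would not.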
The High School Transcript Study (HSTS) explores the relationship between grade 12 NAEP achievement and high school academic careers by surveying the curricula being followed in the nation's high schools and the course-taking patterns of high school students through a collection of transcripts. Recent studies have placed an emphasis on STEM education and how it correlates with student achievement on the NAEP mathematics and science assessments.
The Trends in International Mathematics and Science Study (TIMSS) is an international assessment by the International Association for the Evaluation of Educational Achievement (IEA) that measures student learning in mathematics and science. NCES initiated the NAEP-TIMSS linking study so that states and selected districts can compare their own students' performance against international benchmarks. The linking study was conducted in 2011 at grade 8 in mathematics and science. NCES will "project" state- and district-level scores on TIMSS in both subjects using data from NAEP.
The National Indian Education Study (NIES) is a two-part study designed to describe the condition of education for American Indian/Alaska Native students in the United States. The first part of the study consists of assessment results in mathematics and reading at grades 4 and 8. The second part presents the results of a survey given to American Indian/Alaska Native students, their teachers, and their school administrators. The surveys focus on the students' cultural experiences in and out of school.
Under the 2001 reauthorization of the Elementary and Secondary Education Act (ESEA) of 1965, states develop their own assessments and set their own proficiency standards to measure student achievement. Each state controls its own assessment programs, including developing its own standards, resulting in great variation among the states in statewide student assessment practices. This variation creates a challenge in understanding the achievement levels of students across the United States. Since 2003, NCES has supported research that compares the proficiency standards of NAEP with those of individual states. State assessments are placed onto a common scale defined by NAEP scores, which allows states' proficiency standards to be compared not only to NAEP, but also to each other. NCES has released the Mapping State Proficiency Standards report using state data for mathematics and reading in 2003, 2005, 2007, 2009, and most recently 2013.[13]
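One way to picture placing a state standard on the NAEP scale is an equipercentile-style match: find the NAEP score that the same share of students reaches as meets the state's standard. The sketch below assumes this simplified approach; it is not taken from the NCES report, whose actual procedure works with school-level data and sampling weights. The function name and the score list are hypothetical.

```python
def naep_equivalent(naep_scores, pct_meeting_state_standard):
    """Hypothetical helper: NAEP score reached by the same share of
    students as meets the state's proficiency standard.

    naep_scores: NAEP scale scores for a state's sampled students.
    pct_meeting_state_standard: percent (0-100, exclusive) of students
        meeting the state's own standard.
    """
    if not 0 < pct_meeting_state_standard < 100:
        raise ValueError("percent must be strictly between 0 and 100")
    ordered = sorted(naep_scores)
    # Rank of the lowest score that still belongs to the top share.
    cut_rank = round(len(ordered) * (1 - pct_meeting_state_standard / 100))
    cut_rank = min(max(cut_rank, 0), len(ordered) - 1)
    return ordered[cut_rank]

# Hypothetical scores 200..299, one student per score point.
scores = list(range(200, 300))

# A stricter state standard (fewer students meet it) maps to a higher
# point on the NAEP scale than a more lenient one.
strict = naep_equivalent(scores, 30)
lenient = naep_equivalent(scores, 70)
print(strict, lenient)
```

Because both standards end up on the same scale, they can be compared with each other as well as with NAEP's own cutoffs.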
Over the years, NCES has conducted a number of other studies related to different aspects of the NAEP program. A few studies from the recent past are listed below:
NAEP's heavy use of statistical hypothesis testing has drawn some criticism related to interpretation of results. For example, the Nation's Report Card reported "Males Outperform Females at all Three Grades in 2005" as a result of science test scores of 100,000 students in each grade.[14] Hyde and Linn criticized this claim, because the mean difference was only 4 out of 300 points, implying a small effect size and heavily overlapped distributions. They argue that "small differences in performance in the NAEP and other studies receive extensive publicity, reinforcing subtle, persistent, biases."[15]
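The distinction between statistical significance and effect size in this criticism can be sketched numerically. The 4-point difference comes from the text above; the standard deviation of 35 points and the group size of 50,000 (half of 100,000) are assumptions made only for illustration, as are the function and key names. The simple standard error formula also ignores NAEP's complex sample design.

```python
import math
from statistics import NormalDist

def summarize_gap(mean_diff, sd, n_per_group):
    """Effect size and significance for two equal-sized, equal-SD groups."""
    # Cohen's d: mean difference in standard deviation units.
    d = mean_diff / sd
    # Standard error of the difference under simple random sampling.
    se_diff = sd * math.sqrt(2.0 / n_per_group)
    z = mean_diff / se_diff
    # Shared area under the two normal score distributions.
    overlap = NormalDist(0.0, sd).overlap(NormalDist(mean_diff, sd))
    # Chance that a random member of the higher group outscores
    # a random member of the lower group.
    p_superiority = NormalDist().cdf(d / math.sqrt(2.0))
    return {
        "cohens_d": d,
        "z": z,
        "overlap": overlap,
        "p_superiority": p_superiority,
    }

# 4-point gap from the text; sd=35 and n=50,000 are assumed values.
result = summarize_gap(mean_diff=4.0, sd=35.0, n_per_group=50_000)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the z statistic is far beyond conventional significance thresholds, yet the standardized difference is small and the two distributions overlap almost entirely, which is the pattern Hyde and Linn describe.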
NAEP's choice of which answers to mark right or wrong has also been criticized, a problem which happens in other countries too.[16] For example, a history question asked about the 1954 Brown v. Board of Education ruling, and explicitly referred to the 1954 decision which identified the problem, not the 1955 decision which ordered desegregation. NAEP asked students to "describe the conditions that this 1954 decision was designed to correct." They marked students wrong who mentioned segregation without mentioning desegregation. In fact the question asked only about existing conditions, not remedies, and in any case the 1954 decision did not order desegregation.[17] [18] The country waited until the 1955 Brown II decision to hear about "all deliberate speed." Another history question marked students wrong who knew the US fought Russians as well as Chinese and North Koreans in the Korean War. Other released questions on math and writing have had similar criticism. Math answers have penalized students who understand negative square roots, interest on loans, and errors in extrapolating a graph beyond the data.[19] [20]
NAEP's claim to measure critical thinking has also been criticized. UCLA researchers found that students could choose the correct answers without critical thinking.[21]
NAEP scores each test by a statistical method, sets cutoffs for "basic" and "proficient" standards, and gives examples of what students at each level accomplished on the test. The process to design the tests and standards has been criticized by Western Michigan University (1991), the National Academy of Education (1993), the Government Accountability Office (1993), the National Academy of Sciences (1999),[22] [23] the American Institutes for Research and RTI International (2007), Brookings Institution (2007 and 2016), the Buros Center for Testing (2009),[22] and the National Academies of Sciences, Engineering, and Medicine (2016).
Interpretation of NAEP results has been difficult: NAEP's category of "proficient" on a reading test given to fourth graders reflects students who do well on the test and are at seventh grade level.[24] NAEP's category of "proficient" on a math test given to eighth graders reflects students who do well on the test and are at twelfth grade level.[25] The fact that few eighth graders are proficient by this standard and achieve at twelfth grade level has been misinterpreted to allege that few eighth graders achieve even at eighth grade level.[26] NAEP says, "Students who may be proficient in a subject, given the common usage of the term, might not satisfy the requirements for performance at the NAEP achievement level."[24] James Harvey, principal author of A Nation at Risk, says, "It's hard to avoid concluding that the word was consciously chosen to confuse policymakers and the public."[24]