2061 Connections
An electronic newsletter for the science education community

May/June 2006

No Easy Answers for Middle School Science Assessment

Multiple-choice items pose special challenges for developers

Given the mounting pressure for more extensive testing of students, concerns about the quality of test items are also growing. In science, Project 2061 is helping to address these concerns by developing an online collection of assessment resources that meet rigorous criteria for high quality. Useful for researchers, teachers, and textbook and test developers, the items in the Project 2061 collection are being carefully designed to measure as precisely as possible what students do and do not know about the ideas and skills that are targeted in national and state science standards. (Read an overview of the project.)

Funded by the National Science Foundation, Project 2061’s research team is taking an unusually rigorous approach to developing assessment items. By first examining the alignment of each item to the ideas in a science standard and then taking a close look at other features of the item that might make it difficult to interpret students’ responses, the team is developing items that are precise measures of students’ understanding of the targeted ideas. Although the collection may also include examples of high-quality performance tasks and constructed response tasks, the majority of the items will be multiple-choice questions.

This article focuses on issues that are particularly relevant to multiple-choice questions, which are likely to remain the most widely used item type for high-stakes testing and for classroom evaluation. Project 2061 researchers are examining these issues and coming up with new strategies for improving the quality of items. Articles in upcoming issues of 2061 Connections will look at other aspects of Project 2061’s assessment collection in more detail.

What Are We Testing?

In designing multiple-choice assessment items to measure student understanding of the learning goals in Benchmarks for Science Literacy (Benchmarks) (American Association for the Advancement of Science, 1993) and the National Science Education Standards (NSES) (National Research Council, 1996), we expect students to demonstrate their knowledge of the targeted ideas in a variety of ways—from basic recognition of the idea to more sophisticated applications of the idea. At the most basic level, we want students to recognize a true statement of the idea being tested. But we also want them to use their knowledge to analyze a given situation, to explain relevant phenomena, and to make predictions regarding phenomena using that knowledge. Our goal is to develop a range of multiple-choice test items that challenge students to use their knowledge in ways that can help educators pinpoint where students are in their understanding of an idea.

To further increase the precision of a test item's alignment to a learning goal, Project 2061 is preparing a clarification of the ideas being targeted by each learning goal covered in its collection and identifying misconceptions that students may have about the ideas each learning goal targets. Then, in writing multiple-choice questions, we require that each answer choice—not just the correct one—be aligned to either the ideas in the clarification statement or the related misconceptions. This rigorous specification of content alignment is at the core of Project 2061’s approach to assessment. It requires that students be able to evaluate the truth and relevance of each answer choice based on the knowledge specified in the learning goal itself. (Both the clarifications and descriptions of misconceptions, along with other assessment resources, will be available in Project 2061's online collection.)

Answer Choices

Multiple-choice questions in Project 2061’s collection have four answer choices, and each answer choice is constructed to provide a different piece of information about students' knowledge. One choice, of course, is the correct answer—an accurate prediction of an event, an accurate explanation of a phenomenon, or an accurate statement of a scientific principle. Incorrect answer choices are drawn from the student misconceptions found in the research literature or identified through Project 2061's own research with students. All answer choices should be reasonable or plausible to students, but none of them should fall outside the range of what students who understand the targeted ideas can comprehend. Students who do not know the idea should not be able to eliminate an answer choice because it sounds absurd, for example.

But students who do understand the targeted idea should not get the item wrong because the general vocabulary is unfamiliar or the mental processing that is required is too difficult. In addition, we neither expect students to determine whether more than one answer choice is correct nor ask them to select the "best" answer. Few, if any, learning goals in Benchmarks or NSES focus on the relative importance of more than one factor and, therefore, we do not expect students to be able to make such judgments. The guiding principle is this: If students are not likely to have gained the knowledge needed to answer the question through instruction based on the targeted learning goal, then we do not include answer choices that require that knowledge in a test item.

Setting such a high bar for content alignment, while also attending to the overall cognitive abilities of students, can place a significant burden on item developers. To illustrate, consider the following question: Is it reasonable to assume that middle school students have sufficiently developed the ability to use logical operators such as "must" and "could" in their thinking, so that terms such as these can legitimately be used in assessing their knowledge of science concepts? Here is a learning goal and a test item that attempts to assess students’ understanding of Newton’s Laws of Motion:

Learning goal: If an object’s speed is increasing, the forces on the object in the direction of motion are stronger than the forces on the object in the opposite direction.

Item: A person pushes a box across the floor. There are two horizontal forces on the box: the force of the push, and the force of friction. The force of friction is in the opposite direction to the box's motion. The speed of the box is increasing. Which of these statements must be true?

  A. The force of the push is stronger than the force of friction. [Correct Answer]
  B. The force of the push is the same strength as the force of friction.
  C. The force of the push is strong.
  D. The force of friction is weak.

For students who understand the ideas being targeted, determining the truthfulness of answer choices A and B is straightforward. Because the speed of the box is increasing, the force of the push (the force in the direction of motion) must be greater than the force of friction (the force in the direction opposite the direction of motion), so answer choice A must be true. Conversely, if the two forces were the same strength (answer choice B), the force in the direction of motion would not be greater than the force in the opposite direction and the speed of the box would not be increasing, so answer choice B must be false.

In contrast, answer choices C and D are problematic. Students who understand the relevant ideas (along with the information presented in the item itself) still do not have sufficient knowledge to establish the truthfulness of choice C or D. They can determine that the force of the push is stronger than the force of friction, but they are not able to conclude that the force of the push is strong (answer choice C). In fact, the push could be strong, weak, or anywhere in between, so long as the force of friction is weaker than the force of the push. Similarly, students cannot conclude that the force of friction is weak (answer choice D). Friction also could be weak, strong, or anywhere in between, so long as the force of the push is stronger. Therefore, students are not able to determine the truth of answer choices C and D, as each choice could be true or false.

Remember that the test item asks students to choose the statement that "must be" true, not the statement that "is" true. What does "must be true" mean to a middle school student? The item includes two answer choices that describe conditions that could be true but do not have to be true. When prompted with the words "must be true," does the idea of "could be true" come to mind as an alternative to "must be true"? To an adult, the options are obvious. Using knowledge of the targeted science idea and an understanding of the meaning of "must" (and "could"):

Choice A is true,

Choice B is false,

Choice C could be true, and

Choice D could be true.
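
The classification above can be checked by brute force. The following is only an illustrative sketch and is not part of Project 2061's materials: the force magnitudes, the `STRONG_CUTOFF` threshold that separates "strong" from "weak," and the helper name `classify` are all assumptions made for the example. It lists every pair of forces consistent with the item (the push is stronger than friction) and asks whether each answer choice holds in all of them, in none of them, or only in some.

```python
from itertools import product

# Assumed values for illustration only: arbitrary force units and an
# arbitrary cutoff between "weak" and "strong". Neither comes from the item.
MAGNITUDES = range(1, 11)
STRONG_CUTOFF = 5

# Each answer choice as a test on (push, friction).
STATEMENTS = {
    "A": lambda push, friction: push > friction,    # push is stronger
    "B": lambda push, friction: push == friction,   # same strength
    "C": lambda push, friction: push >= STRONG_CUTOFF,    # push is "strong"
    "D": lambda push, friction: friction < STRONG_CUTOFF,  # friction is "weak"
}


def classify(statement, scenarios):
    """Label a statement by whether it holds in all, none, or some scenarios."""
    results = [statement(push, friction) for push, friction in scenarios]
    if all(results):
        return "must be true"
    if not any(results):
        return "must be false"
    return "could be true"


# The box is speeding up, so by the learning goal the push exceeds friction.
scenarios = [(p, f) for p, f in product(MAGNITUDES, repeat=2) if p > f]

for label, statement in STATEMENTS.items():
    print(label, classify(statement, scenarios))
```

Under these assumptions, only choice A holds in every scenario and only choice B fails in every scenario. Choices C and D hold in some scenarios and fail in others, which is exactly the "could be true" category that the item asks students to tell apart from "must be true."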

Developmentally Appropriate?

Although this kind of logical reasoning is included in a number of benchmarks and standards for students in grades 9-12, it is not expected of middle school students. Are middle school students able to understand terms like "must" (and "could") well enough for us to use items like this to assess their understanding of science content? Do such items sometimes lead us to the mistaken conclusion that students do not know the science content when in fact they are uncomfortable with the logical distinctions? Without answers to these and other similarly probing questions about the nature and form of science assessment, students and teachers face significant consequences.

Through its in-depth look at middle school science assessment, Project 2061 is bringing to light a variety of issues and attempting to provide new insights for the science education community. We are seeking answers to the kinds of questions raised by the "must be true" example through pilot testing and through interviewing students about the test items. We will be reporting on these results and raising additional questions in future articles.

# # #

The first set of items in the Project 2061 online collection will be made available for review later this year. In the meantime, if you have had experience with middle school students' ability to use logical reasoning in the ways described here, we would welcome hearing from you. Please contact:

Principal Investigator: Dr. George DeBoer, (202) 326-6624

Research Associate: Dr. Thomas Regan


American Association for the Advancement of Science. (1993). Benchmarks for science literacy. New York: Oxford University Press.

National Research Council. (1996). National science education standards. Washington, DC: National Academy Press.
