Lessons Learned from Pilot Testing Assessment Items
Project 2061 presents assessment work at NARST Annual Conference
Project 2061's assessment team is learning that when it comes to designing high-quality assessment items in science, small details can make a big difference. The terminology used in a test item, the familiarity of an item's context, the exact phrasing of answer choices: these and other factors can all affect how students respond, sometimes in unexpected ways. Confusing or unfamiliar terms may prevent students from answering correctly even when they know the science idea being tested. Likewise, incorrect answer choices that seem implausible may lead students to choose the right answer when they don't really know the targeted science idea.
In April, Project 2061 researchers traveled to the National Association for Research in Science Teaching (NARST) Annual Conference in New Orleans to share findings from their latest assessment studies. The studies are all part of Project 2061’s effort to build an online collection of high-quality middle school and early high school science assessment items that are linked to national standards. (Read an overview of the assessment project.)
In a comprehensive symposium, Project 2061 reported on how multiple-choice assessment items can be developed that precisely measure student understanding of the ideas specified in Benchmarks for Science Literacy (American Association for the Advancement of Science, 1993) and the National Science Education Standards (National Research Council, 1996). Multiple-choice tests are often criticized for assessing student knowledge of just the facts of science, but they can also be constructed to ask students to think through more complex situations and to analyze, explain, and predict phenomena. Although a considerable amount of effort is required to construct such test questions, when done well they provide educators with important information about what students know and can do.
Examining Student Thinking
Project 2061’s item development process includes pilot testing the items with students, during which students are asked to provide written comments about the test items and their answer choices.
“During pilot testing, the students reveal things that are almost impossible to predict in advance,” explained George DeBoer, deputy director of Project 2061 and organizer of the NARST symposium. “Although we always think about how much we can take for granted—with respect to vocabulary, with respect to students’ understanding of prior ideas from earlier grades, and with respect to logical reasoning—unless we get feedback from the students themselves, we never know for sure.”
“We learned, for example, that not all middle school students know that the energy from the sun is captured in the leaves of a plant. A test item that we wrote had a tree with leaves, but also some oranges hanging from it. The oranges were there to provide context for the item. We weren’t actually trying to find out what students thought about the oranges. What we found out was that some of the students thought that the light from the sun was captured directly by the oranges, not the leaves. In another question we learned that a significant number of students think that ‘a force in the direction of motion of an object’ is a force in front of the object blocking the object’s path. Asking students to explain why each answer choice is correct or incorrect, and what they find confusing about the test items, is an invaluable part of our item development work.”
The following papers and posters from the NARST Annual Conference provide details about the methodology and findings of Project 2061’s assessment item development and about studies in the topics of controlling variables, thermal expansion and contraction, and plate tectonics:
“Assessment Linked to Science Learning Goals: Probing Student Thinking During Item Development”
George E. DeBoer, Cari Herrmann Abell, & Arhonda Gogos
Read the paper [PDF, 378KB].
“Assessing Students' Understanding of Controlling Variables”
Arhonda Gogos & George E. DeBoer
“Probing Middle School Students' Knowledge of Thermal Expansion and Contraction Through Content-Aligned Assessment”
Cari F. Herrmann Abell & George E. DeBoer
“Determining the Appropriateness of Terminology in Content-Aligned Assessments for Middle School Students: Examples from Plate Tectonics”
Paula N. Wilson & George E. DeBoer
Read the paper [PDF, 88KB].
“Assessing Student Understanding of Plate Tectonics”
Paula N. Wilson & George E. DeBoer
View the poster [PDF, 56KB].
# # #
For more information about Project 2061's assessment research, please contact:
Principal Investigator: Dr. George DeBoer, (202) 326-6624
Research Associate: Dr. Cari Herrmann Abell, (202) 326-6648
American Association for the Advancement of Science. (1993). Benchmarks for science literacy. New York: Oxford University Press.
National Research Council. (1996). National science education standards. Washington, DC: National Academy Press.