Aligning Student Assessment to State and National Content Standards
George E. DeBoer
AAAS Project 2061
NSTA National Convention
Dallas, TX, April 1, 2005
Why align assessment with content standards?
Student assessments have long served as an indicator of educational
success and can be a powerful force in improving curriculum and instruction.
Current reform efforts emphasize the importance of aligning assessment
items with important learning goals. Developers and users of assessments
are continually faced with the challenge of determining whether an
assessment item can effectively reveal what students know and are
able to do with respect to the content standards. Project 2061 has
developed a procedure for examining the alignment of assessment items
with the ideas and skills they were written to assess. The procedure
is useful to national and state assessment developers and to curriculum
developers and classroom teachers who use test items as a basis for
important instructional decisions. In this session we will demonstrate
the alignment procedure and provide examples of items that are aligned
and not aligned to national standards, including the NRC’s National
Science Education Standards and AAAS’s Benchmarks
for Science Literacy.
Judging alignment of assessment items
The AAAS Project 2061 assessment analysis procedure (AAAS, 2004)
examines items for their alignment to the exact ideas specified in
the content standards and for features that might make interpretation
of student understanding difficult. For example, an item that uses
language that is out of reach of the students would provide questionable
evidence of whether those students understand the ideas in the content
standard.
Targeting content knowledge
The first two questions we ask when determining alignment to content
standards are: (1) Is the knowledge or skill specified in the content
standard needed to produce a satisfactory response or can the task
be completed some other way? (2) Is the knowledge enough by itself
to make a satisfactory response or is other knowledge also needed?
Items may be found to align with more than one content standard or
to a content standard other than the one targeted. Items may be found
to be unrelated to any of the ideas they are intended to measure.
This procedure does not provide a quantitative measure of alignment
but rather a description of what the item is measuring. The results
of the analysis provide a basis for modifying the item so that it
can be made to align more closely with a targeted content standard
and for informing judgments that are made about what students do and
do not know about a given science idea.
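The two questions above can be recorded as a pair of yes/no judgments for each item. The following is a minimal sketch, not part of the Project 2061 procedure itself: the class name, field names, and wording of the descriptions are hypothetical, chosen only to show how the two answers combine into a description rather than a score.

```python
from dataclasses import dataclass


@dataclass
class AlignmentJudgment:
    """A reviewer's answers to the two content-alignment questions for one item.

    All names here are illustrative assumptions, not Project 2061 terminology.
    """
    item_id: str
    standard: str
    necessary: bool   # Q1: is the targeted knowledge needed for a satisfactory response?
    sufficient: bool  # Q2: is the targeted knowledge enough by itself?

    def describe(self) -> str:
        """Return a qualitative description of what the item appears to measure."""
        if self.necessary and self.sufficient:
            return "aligned: targeted knowledge is needed and is enough"
        if self.necessary:
            return "partly aligned: other knowledge is also needed"
        if self.sufficient:
            return "partly aligned: the task can be completed another way"
        return "not aligned: targeted knowledge is neither needed nor enough"


# Hypothetical item, used only to show the call pattern.
judgment = AlignmentJudgment(
    item_id="item-1",
    standard="chemical reactions",
    necessary=False,
    sufficient=True,
)
print(judgment.item_id, "->", judgment.describe())
```

The output is a description, in keeping with the point that the procedure does not produce a quantitative measure of alignment.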
Effectively probing student understanding
Even if an item is well matched to the ideas in a content standard,
a number of factors can affect whether the item enables valid
conclusions to be drawn about student
knowledge. Features such as comprehensibility, clarity of expectations,
and resistance to test-wiseness can significantly contribute to an
item’s potential usefulness. Are students likely to understand
the task statement, diagrams and symbols? Is the task context appropriate?
Could students respond satisfactorily by simply guessing or using
other general test-taking strategies? Considerations such as these
can reveal critical flaws in items that could undermine their potential
to be effective.
Aims of analysis
This analysis procedure does not evaluate
the merit of the content standards addressed by an item or by sets
of items, although questions about how clearly the content
standards are stated do arise during the analysis of the assessment
items. Nor does it address the effect that any particular item might
have on the psychometric features of the scale to which it
belongs. By focusing on an item’s targeted knowledge and its likely
effectiveness as a probe of student knowledge, the procedure helps
to articulate exactly what is being tested by a particular item, thus
improving the validity of interpretations that can be made from performance
results. Obviously, one item, or even a set of items, can never give
us complete confidence that students understand or do not understand
an idea, but every item should have the potential to provide evidence
of student understanding. The exact number of items needed for adequate
assessment of any particular idea or skill remains an empirical question.
Case study: Evaluating assessments from middle school chemistry and biology
In response to the need for curricula closely aligned with national
science education standards, the University of Michigan and Northwestern
University are developing a project-based science curriculum for middle
school (Krajcik & Reiser, in preparation). Their work began with
the development of two units — a 7th grade chemistry unit and
an 8th grade biology unit (Reiser et al., 2003). The two units were
developed using an assessment-driven design approach (Wiggins & McTighe,
1998) in which very specific learning objectives (derived directly
from content standards) were clearly articulated beforehand and then
subsequently used to guide the design of both instructional activities
and assessment tasks (McNeill et al., 2003).
The three main content areas addressed by the chemistry unit are:
(1) substances and their properties, (2) chemical reactions, and (3)
conservation of matter. The biology unit focuses on ideas around (1)
diversity of life and (2) the interdependence of life. The assessment
instruments administered to students during the pilot study of the
curriculum project included both multiple-choice and open-ended items.
These items were written to assess student understanding of specific
content standards related to the main content areas.
Assessment items from this curriculum development project are used
in this presentation to illustrate how the Project 2061 assessment
analysis procedure can be used to determine how well assessment items
are aligned to the targeted content standards and whether any features
of the items may interfere with accurately determining what students
do and do not know with respect to the ideas in those content standards,
and thus with the conclusions that can be drawn about student learning.
(In most cases these are early drafts of items that have subsequently
been revised and are used only to illustrate the procedure.) Student
responses were reviewed for content alignment, the frequency with which
certain distracters were chosen, what the items tell us about student
misconceptions, specific factors that might influence comprehensibility,
and the challenge of measuring deep understanding of concepts and
the application of knowledge and skills. The results show items that
are closely aligned with targeted learning goals and those that assume
more of students than the learning goal specifies. The analysis points
to items that can be answered without the knowledge specified in the
learning goals and those that can be answered only with that knowledge.
For example, students can answer one of the items
correctly if they know that boiling is a phase change. However, the
item was intended to find out if students know that phase change is
not a chemical reaction. Although the item is classified as assessing
understanding of chemical reactions, it actually assesses knowledge
of boiling as a phase change. Awareness of this fact prevents questionable
conclusions from being drawn about student understanding of the ideas
or skills the item was written to assess. The results also show items
that require additional knowledge or skill whose exact nature
is not specified in the content standard.
In these cases, those interpreting assessment results must decide
whether that knowledge or skill is something that all
students of that age can be expected to have, or whether the items that
incorporate it are measuring something other
than what was intended.
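One part of the review described above, the frequency with which certain distracters were chosen, amounts to a simple tally of responses to a multiple-choice item. The sketch below is illustrative only: the function name is hypothetical and the response data are invented for the example, not results from the pilot study.

```python
from collections import Counter


def distracter_frequencies(responses, correct):
    """Return the fraction of students who chose each incorrect option.

    `responses` is a list of option labels, one per student; `correct` is
    the label of the keyed answer. Both are assumptions for illustration.
    """
    total = len(responses)
    if total == 0:
        return {}
    counts = Counter(responses)
    return {
        choice: count / total
        for choice, count in counts.items()
        if choice != correct
    }


# Invented responses from eight students to one item keyed "A".
responses = ["A", "B", "B", "C", "B", "A", "D", "B"]
frequencies = distracter_frequencies(responses, correct="A")
for choice in sorted(frequencies):
    print(choice, frequencies[choice])
```

A distracter chosen far more often than the others, such as "B" in this invented data, may point to a common misconception or to a flaw in the item, and is worth examining against the alignment questions.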
For More Information:
For more information on the Project 2061 Assessment
Analysis Procedure or the slides for this presentation, contact George
DeBoer.
The slide presentation for "Aligning Student
Assessment to State and National Content Standards" is also available
[PDF 901KB].
References
American Association for the Advancement of Science (2004). Assessment
with Precision: Project 2061 Building a Collection of Test Items
Aligned to Standards. 2061 Today, Vol. 14, No. 2.
Krajcik, J., & Reiser, B. J. (Eds.). IQWST: Investigating
and Questioning our World Through Science and Technology.
Ann Arbor, MI: University of Michigan (in preparation).
McNeill, K. L., Lizotte, D. J., Harris, C. J., Scott, L. A., Krajcik,
J., & Marx, R. (2003). Using backward design to create standards-based
middle school inquiry-oriented chemistry curriculum and assessment
materials. Paper presented at the Annual Meeting of the National
Association for Research in Science Teaching, Philadelphia, PA.
Reiser, B. J., Krajcik, J., Moje, E., & Marx, R. (2003). Design
strategies for developing science instructional materials. Paper
presented at the Annual Meeting of the National Association for Research
in Science Teaching, Philadelphia, PA.
Wiggins, G. P., & McTighe, J. (1998). Understanding by design.
Alexandria, VA: Association for Supervision and Curriculum Development.
This work is funded by the National Science
Foundation (ESI 035247).