Identifying Curriculum Materials for Science Literacy: A Project 2061 Evaluation Tool
This report describes an early version of Project 2061's curriculum-materials
evaluation procedure. The report is based on a paper prepared for the Colloquium
"Using the National Science Education Standards to Guide the Evaluation, Selection,
and Adaptation of Instructional Materials," which was held at the National
Research Council, November 10-12, 1996.
Introduction
With Project 2061’s publication of Science for All
Americans (1989) and Benchmarks for Science
Literacy (1993) and the National Research Council’s release of
the National Science Education Standards (1996),
there now exists a strong national consensus among educators and scientists
on what all K-12 students need to know and be able to do in science, mathematics,
and technology. The overwhelming similarity between Benchmarks and
the NSES means that curriculum materials that support students’
learning benchmarks will likewise promote learning science standards. Valid
identification of such curriculum materials is of great interest to educators
nationwide, even more so since state and district frameworks are drawing heavily
on the national documents.
Beginning in 1991, Project 2061 sought to quickly develop a database of reviewed
materials that could be used in making local adoption decisions. Because the
pool of existing materials was large, there was no way that a group as small as
the Project 2061 staff could analyze them all. To engage a large number of
people in analysis, it would be necessary to have an analysis procedure that
could be used, with a reasonable amount of training, to give valid and reliable
results. By valid, we mean that the conclusions reached would derive from
accurate interpretations of benchmarks and sound principles of effective teaching.
By reliable, we mean that independent analysts would reach similar conclusions
and cite similar evidence for them. Unfortunately, no procedure existed for
judging whether or how well curriculum materials matched specific learning
goals like benchmarks. We found that the impressionistic or check-list
procedures common in curriculum evaluation yielded neither consistent nor
defensible results, whether the judgments were other people’s or our own.
We therefore began a major effort to develop an adequately valid and reliable
procedure, and we have made considerable progress.
Premises
The central proposition in the Project 2061 procedure for analyzing curriculum
materials is that they are to be judged primarily in terms of how well
they are likely to contribute to the attainment of specific learning goals.
While it is certainly important that materials are scientifically accurate,
age appropriate, and motivating, if they do not also contribute significantly
to students’ learning important and agreed upon ideas and skills, they
are not suitable for adoption. Hence the Project 2061 procedure concentrates
on examining curriculum materials in the light of a coherent set of learning
goals. The particular learning goals used here are those found in Benchmarks
and in the content component of National Science Education Standards,
but in principle any well thought out goal statements could be used as long
as they are learning goals and as long as they are specific.
A second premise of the Project 2061 approach to the analysis of curriculum
materials is that account has to be taken of both the content and
instructional properties of the materials under examination. It does little
good for materials to simply include the content of specific learning goals
if the instructional strategies recommended in the material are not consistent
with what is known about how students learn. To bring home this point, consider
that Benchmarks for Science Literacy is a perfect content match to
benchmarks—it contains explicit and specific material that is relevant
to all benchmarks and does not contain any material that goes beyond them—yet
no one would argue that Benchmarks should be used as a student textbook.
Mere statement of the goals is far from enough to help students to achieve
them. For materials to be credible, it must be evident how students could
actually learn what is intended from them.
The Procedure in Brief
The analysis of curriculum materials involves four steps, summarized here and
described in greater detail in Appendix A:
Preliminary Inspection.
This is to determine whether the material merits further analysis and, if so,
to identify the learning goals that will serve as the focus of further study.
Content Analysis.
The purpose here is to determine whether the content in the material matches
specific learning goals—not just whether the topic headings are similar.
(At the "topic" level it is possible to align almost any curriculum
with Benchmarks or NSES. This step in the analysis demands more
than a topic correspondence.) Reviewers proceed to the next step only if results
of this analysis are promising.
Instructional Analysis.
This looks at the match between the material’s treatment of specific learning
goals and what is known about student learning and effective teaching. The
purpose here is to estimate how well the instructional strategies in the material
support students’ learning of those very ideas and skills for which there
is a content match. It should be possible to point to evidence of effective
instruction in the material, benchmark by benchmark. (It is possible that
materials would both (a) show a content match to particular benchmarks and
(b) have plausible instructional strategies in general, yet not focus those
strategies on those particular benchmarks.)
Summary Report.
The analysis concludes with a summary of what the material under consideration
can be expected to accomplish in terms of specific learning goals.
We have found this four-step procedure is helpful to those doing the analysis
for the first time. More experienced users of the procedure tend to combine
some of the steps. For example, as knowledge of benchmarks and standards increases,
the preliminary examination may be combined with the content analysis.
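The way each step gates the next can be sketched in code. The Python below is only an illustration: the `Finding` record, its boolean fields, and the `summarize` function are hypothetical names, and the booleans stand in for reviewers' evidence-based judgments, which the procedure does not reduce to a formula.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """Hypothetical record of a review team's judgments for one benchmark."""
    benchmark: str
    sighted: bool                # preliminary inspection: some treatment found
    content_match: bool          # content analysis: substance, not just topic
    instruction_supported: bool  # instructional analysis targets this benchmark


def summarize(findings):
    """Map each benchmark to the furthest step of the analysis it survived."""
    summary = {}
    for f in findings:
        if not f.sighted:
            summary[f.benchmark] = "not addressed"
        elif not f.content_match:
            # Later steps are skipped: instruction is only judged
            # for benchmarks that already have a content match.
            summary[f.benchmark] = "topic-level only"
        elif not f.instruction_supported:
            summary[f.benchmark] = "content match, weak instructional support"
        else:
            summary[f.benchmark] = "content match with instructional support"
    return summary
```

The ordering of the checks mirrors the text: a benchmark without a content match is never credited for instructional support, even if the material's strategies look plausible in general.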
Features
Eight features of the 2061 procedure are key to its design. The first four
are strategic elements of the procedure itself; the last four characterize
the tools available to users. Taken together, these
features distinguish the 2061 analysis procedure from other evaluation forms.
- Specific Learning Goals. Both Benchmarks and NSES
specify what students should know or be able to do at a fairly fine grain
size. The 2061 analysis procedure examines the alignment of materials
to specific learning goals—benchmarks and fundamental understandings—rather
than to section headings or general topics. For example, within the section
"The Earth," Benchmarks specifies precisely what students should
know about the water cycle by the end of grades 2, 5, 8, and 12. By the
end of grade 2 students should know that "Water left in an open container
disappears, but water in a closed container does not disappear."
By the end of grade 5 students should know the (more sophisticated) idea
that "When liquid water disappears it turns into a gas…"
and by the end of grade 8 students should be able to explain evaporation
in terms of molecules. A first grade activity that has students comparing
yearly rainfall patterns around the world addresses the topic "water
cycle" but not the substance of the K-2 benchmark "Water left
in an open container disappears, but water in a closed container does
not disappear." But having students observe the water loss in their
classroom fish tank after a holiday and investigating whether covering
it solves the problem is more to the point. Benchmarks and NSES
provide a set of developmentally appropriate goals based on learning research,
making it unnecessary for reviewers to intuit what is appropriate for
students in different grades.
- Instruction Tied to Learning Goals. The 2061 procedure examines
how well the instructional and assessment strategies in the material contribute
to students’ learning the specific benchmarks. For example, criteria
that probe whether or not the material "includes activities that
provide first-hand experiences with phenomena" and "includes
question sequences to guide student interpretation and reasoning about
phenomena" ask evaluators to examine whether the material contains
activities and calls for reflection on each benchmark to be learned.
Similarly, the assessment tasks are examined for their match to specific
benchmarks. It is important to note that in looking at many materials
in terms of many different benchmarks, we have seen that a material may
treat one benchmark quite well and another quite poorly. That makes it
difficult to form good global judgments. It’s possible, of course,
to consider whether a material employs generally supportive instructional
strategies such as hands-on activities and cooperative groups, independent
of what learning goals they may contribute to. But the instructional strategies
might turn out to be aimed at only some of the learning goals of interest—or
none at all. (This is the most central proposition of our work: The desirability
of instruction and content cannot be considered separately!)
- Evidence-based Arguments. The 2061 analysis procedure requires
that judgments be supported by evidence-based arguments and take account
of both the quantity and the quality of the evidence. The 2061 criteria
are not used as a check list, nor can analysts get by with giving unsupported
opinions. Analysis reports include descriptions of the supporting evidence
and page number references—benchmark by benchmark—so that readers
can check them and form independent judgments (and judge the credibility
of the reviewer). In coming to conclusions about the quality of evidence,
a reviewer might argue that a material that includes carefully sequenced
questions to prompt student reflection on a particular phenomenon
is deserving of a higher rating than a material that simply instructs
teachers to "encourage students to discuss their ideas about the
motion of molecules." We have found that when reviewers are required
to justify their conclusions, the quality and reliability of their judgments
improve.
- Feedback. The 2061 procedure involves teams of educators who examine
and comment on one another’s judgments and supporting evidence. For
example, one team member may question whether activities that a curriculum
material presents as matching a particular benchmark really do. Another
may question whether the supporting evidence included in a report truly
responds to a criterion. Holding judgments and evidence up to such scrutiny
improves the quality of the analysis reports, and having successfully
defended their claims builds reviewers’ confidence in their judgments.
Team members have indicated both how much they value good feedback when
it is given and how much they miss it when it is not.
- Clarification of Learning Goals. The meaning of any learning goal,
however specific and clearly written it may be, is nonetheless subject
to the interpretation of users. We have observed that simply reading a
benchmark is often insufficient to grasp its intent, because users typically
couch it in their own idiosyncratic understanding. This can lead them to
overestimate or underestimate what a benchmark expects students
to know or be able to do (Roseman, 1997). Before
attempting to match activities in a curriculum material to the content
of any benchmark, review teams study Science for All Americans
and Benchmarks for Science Literacy to clarify the benchmark’s
meaning. Science for All Americans provides a narrative context
that clarifies where the benchmark is aiming. Other benchmarks before
or after a benchmark in a K-12 list clarify the level of sophistication
intended in the one under consideration. A strand map from Benchmarks
on Disk provides still more context in terms of what goes into understanding
a benchmark and where it will lead. Benchmarks essays describe
difficulties students may have with the benchmark topic and offer some
suggestions for helping students achieve the benchmark. Research summaries
suggest likely limitations in student understanding of the benchmark and
provide rationale for its grade-level placement.
- Specific Criteria. The 2061 procedure uses highly specific analysis
criteria. For example, materials are examined for how well they alert
teachers to commonly held student ideas (both troublesome and helpful)
such as those described in Benchmarks Chapter 15: The Research Base,
and then for how well the material explicitly
addresses those commonly held student ideas. In contrast, other procedures
ask for more general impressions: For example, "Do the materials
reflect current knowledge about effective teaching and learning practices
based on research related to science education?" The more general
the criterion, the more open it is to varied interpretations and sampling
variations and hence the less likely it is that different analysts will
reliably interpret it and respond accordingly.
- Clarification of Criteria. To help users interpret the criteria,
the 2061 analysis procedure gives the rationale for the analysis criteria
and elaborates what a response to each should include. For example, the
question "Does the material alert teachers to commonly held student
ideas?" is clarified with the following paragraphs:
Students usually have ideas about how the world works even before instruction.
Some ideas are intuitions that are in basic agreement with the scientists’
views, others (labeled often as misconceptions) are in disagreement/conflict
with currently accepted scientific theories. Some of the students'
misconceptions work fairly well in familiar contexts and are highly
resistant to change. Knowing the ideas that students typically have
helps teachers decide what ideas to build on and what changes to promote…
Responding to this question involves examining a) whether there is
research on commonly held student ideas in the topic area(s) that
the material addresses, b) whether the material alerts teachers to
such ideas, c) whether the material accurately represents research
findings, and d) what proportion of commonly held ideas identified
by the research are described in the material. Summaries of research
on students’ ideas in science (such as those included in Benchmarks
Chapter 15: The Research Base or Making Sense of Secondary Science:
Research Into Children’s Ideas, by Rosalind Driver, Ann Squires,
and Valerie Wood-Robinson. New York: Routledge, 1994) will
be helpful to reviewers who want to know what ideas students typically
have about the topics that the curriculum material they are examining
addresses. If there is no research on student ideas in the topic area(s)
that the material addresses, the material should not be faulted for
not addressing this criterion.
- Concrete Examples of Applying Criteria. The 2061 analysis procedure
employs examples of applying the criteria to specific goals and specific
materials. Case study reports of fully analyzed materials are provided
to illustrate the use of the criteria, what a good argument consists of,
and what evidence justifies each rating—high, medium, and low. To
illustrate a "high" score on the criterion "Reflecting
on Activities" the following example is used. The material in question,
Matter and Molecules (Berkheimer et al.,
1988), is attempting to teach the idea that "All matter is composed
of molecules that are constantly moving." During an activity in which
students place hard candy in both hot and cold water, they are asked to
make some predictions:
- How do you think what happens in the two cups will be the same?
How do you think what happens in the two cups will be different?
Explain your predictions.
[After 10 minutes, students are asked to look at the two cups and
compare them.]
- How are the two cups the same? How are they different?
- There are many ways that the two cups are the same after
10 minutes, and one important way is that some of the
candy dissolved in each cup. Try to write an explanation
of how this happened. Remember to answer the question
about substances and the question about molecules in your
explanation.
- An important difference is that the candy dissolved faster
in one of the cups. In which cup did the candy dissolve
faster? What was different about the molecules of hot and cold
water that would make the candy dissolve faster?
As was evident in comparing our first cycle to our second cycle of development
(described below), including all of the features described above greatly increases
the validity and reliability of the analysis procedure. Furthermore, because
of the specificity and detail provided in the analysis reports, results of
the 2061 procedure can facilitate attempts to better align curriculum materials
with benchmarks and standards.
Developing and Testing the Procedure
In developing and trying out the analysis procedure, Project 2061 has involved
over 100 K-12 teachers, teacher educators, materials developers, cognitive
researchers, and scientists. In two cycles of materials evaluation, three-person
teams used the procedure to analyze a few materials, reported their findings,
and suggested modifications to the procedure. In each cycle, the reliability
of the analysis reports was tested by having two teams analyze each curriculum
material independently. A similar strategy was used at 6 national sites—both
statewide and school-district—to test the procedure under field conditions.
A list of materials examined is provided in Appendix B.
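One simple way to put a number on how closely two independent teams agree is to compare the sets of benchmarks each team judged a material to match. The sketch below uses Jaccard overlap for this; that choice, and the function name `team_agreement`, are assumptions made for illustration, since this report does not state which statistic, if any, was used to assess reliability.

```python
def team_agreement(team_a, team_b):
    """Jaccard overlap between two teams' sets of matched benchmarks.

    Returns 1.0 when the teams list exactly the same benchmarks and
    0.0 when they have none in common. This is an illustrative measure,
    not the statistic used by Project 2061.
    """
    a, b = set(team_a), set(team_b)
    if not a and not b:
        # Two empty lists agree trivially (both teams found no matches).
        return 1.0
    return len(a & b) / len(a | b)
```

For example, if one team lists benchmarks 4C3-5#1 and 4C6-8#2 and the other lists 4C3-5#1 and 1B6-8#2, they share one benchmark out of three distinct ones, an overlap of one third. A measure like this captures only agreement on conclusions; the report's definition of reliability also asks that teams cite similar evidence.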
We have learned that the procedure can be reliably used only if the following
conditions are met:
- Review teams have more than a superficial understanding of the content
to be learned.
- Review teams understand what constitutes an appropriate level of sophistication
of that content for K-12 students. (It is not enough that analysts
are familiar with the content; they should also know what of that
content contributes to literacy at various grades. For Benchmarks
this has been spelled out for K-2, 3-5, 6-8, and 9-12; for Standards,
for K-4, 5-8, and 9-12.)
- Review teams are knowledgeable about reported difficulties that students
have learning that content. (Benchmarks Chapter 15: The Research Base
summarizes research on students’ ideas that contributed to the substance
and grade-level placement of benchmarks. The new Project 2061 tool
Resources for Science Literacy (AAAS, 1997) supplements Benchmarks
with descriptions of over 100 articles, reports, books, and videos that
summarize difficulties students have with Benchmarks ideas.)
- Review teams have undergone considerable training in the use of the
analysis procedure that includes practice with feedback. (A minimum
of 4 days is needed to teach the procedure, but this needs to
be followed by practice with feedback to ensure learning.)
- The materials to be analyzed are not too dissimilar in type of content
from the materials used as examples in training. (Case study materials
illustrate the application of the analysis criteria to benchmarks
in commonly taught topics in life, earth, and physical science. Analysts
have difficulties transferring the procedure to benchmarks less commonly
taught—for example, to benchmarks in the nature of science or
common themes. We are planning to develop additional case studies
to illustrate how the procedure applies to less familiar topics.)
Reviewers were enthusiastic about their involvement in the analysis work. Over
80% indicated on a follow-up questionnaire that they are interested in reviewing
other materials according to the 2061 procedure.
Results
Results of using the procedure are often surprising. Analysis teams find that
quick judgments about alignment to benchmarks or content standards are frequently
contradicted by a more rigorous analysis. This held both for single units and
across several units in a program. A superficial examination most often overestimates
what a material can be expected to accomplish. For example, when educators
were asked how well River Cutters (a grade 6-9 curriculum module) addressed
benchmarks, their initial judgments were far more optimistic than their judgments
after completing the 2061 analysis. After initially listing 22 benchmarks
that a cursory read led them to suspect were addressed in River Cutters,
they found actual sightings for only 12 of them. After studying the meaning
of the benchmarks carefully and revisiting the sightings with this more sophisticated
understanding, they found that only 6 had a respectable content match. And
on considering the instructional strategy of the material, only 1 was found
to be instructionally well-supported, as shown below.
| Suspected Benchmarks | Sighted Benchmarks | Content-Matched Benchmarks | Instructionally-Supported Benchmarks |
|---|---|---|---|
| 1A3-5#1 | 1A3-5#1 | | |
| 1B6-8#2 | 1B6-8#2 | 1B6-8#2 | |
| 3B6-8#2 | 3B6-8#2 | 3B6-8#2 | |
| 3C6-8#5 | | | |
| 3C6-8#6 | 3C6-8#6 | | |
| 4C3-5#1 | 4C3-5#1 | 4C3-5#1 | 4C3-5#1 |
| 4C6-8#2 | 4C6-8#2 | 4C6-8#2 | |
| 4C6-8#3 | 4C6-8#3 | | |
| 4C6-8#4 | | | |
| 4C6-8#6 | 4C6-8#6 | 4C6-8#6 | |
| 4C6-8#7 | | | |
| 5F6-8#3 | | | |
| 8C6-8#5 | | | |
| 11B6-8#1 | 11B6-8#1 | 11B6-8#1 | |
| 11B6-8#3 | | | |
| 12A6-8#1 | 12A6-8#1 | | |
| 12A6-8#3 | 12A6-8#3 | | |
| 12B6-8#5 | | | |
| 12C6-8#5 | 12C6-8#5 | | |
| 12D6-8#5 | | | |
| 12E6-8#3 | | | |
Figure 1: Benchmarks identified at different stages
of analysis: "suspected" benchmarks, "sighted" benchmarks,
content-matched benchmarks, and instructionally-supported benchmarks. Suspected
benchmarks are identified after briefly looking at the material; sighted
benchmarks have identifiable treatment in the material; content-matched
benchmarks survive after the meaning of the benchmarks is clarified;
instructionally-supported benchmarks are rated "high" or "medium"
on most of the instructional analysis criteria.
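The narrowing shown in Figure 1 can be checked mechanically. The sketch below records the last three columns of the figure as sets and confirms that each stage is a subset of the one before it. It also encodes the rule for "instructionally supported" under one assumption that is not spelled out in the text: "most" is read here as "more than half" of the criteria. The function and constant names are hypothetical.

```python
# Benchmark codes copied from the Sighted, Content-Matched, and
# Instructionally-Supported columns of Figure 1.
SIGHTED = {
    "1A3-5#1", "1B6-8#2", "3B6-8#2", "3C6-8#6", "4C3-5#1", "4C6-8#2",
    "4C6-8#3", "4C6-8#6", "11B6-8#1", "12A6-8#1", "12A6-8#3", "12C6-8#5",
}
CONTENT_MATCHED = {
    "1B6-8#2", "3B6-8#2", "4C3-5#1", "4C6-8#2", "4C6-8#6", "11B6-8#1",
}
INSTRUCTIONALLY_SUPPORTED = {"4C3-5#1"}


def is_nested(*stages):
    """True if every stage is a subset of the stage before it."""
    return all(later <= earlier for earlier, later in zip(stages, stages[1:]))


def is_instructionally_supported(ratings):
    """Apply the Figure 1 rule to one benchmark's criterion ratings.

    `ratings` is a list of "high", "medium", or "low" values, one per
    instructional criterion. Assumption: "most" means more than half.
    """
    favorable = sum(1 for r in ratings if r in ("high", "medium"))
    return favorable > len(ratings) / 2
```

With these sets, the stage sizes are 12, 6, and 1, matching the counts reported for River Cutters, and no benchmark appears at a later stage without having survived the earlier ones.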
These results are meant less as a criticism of River Cutters (which
could certainly be modified to address more benchmarks) than to illustrate
how one can be easily misled by a superficial analysis. A similar pattern
is obtained from analysis of a variety of K-12 curriculum materials. To the
extent that other evaluation procedures’ and developers’ claims
about their own materials are made at the suspected or at the "sightings"
level, their reports will not be credible.
The good news is that, through the use of the procedure, we have identified
highly credible materials that are likely to support students’ learning
of benchmarks and standards. For example, stand-alone units developed over
a period of years by the Institute for Research on Teaching and by the Michigan
Department of Education are well-aligned in terms of content and instruction.
These units, though not commercially polished, are readily available at relatively
low cost and could be used as is by those eager to get started. And some selected
units within elementary, middle, and high-school courses can be used as is
or readily modified to be better aligned with national learning goals. Even
a small module like River Cutters could contribute to students’
understanding of the utility of models by, for example, including question
sequences to guide student interpretation of their river runs and their reasoning
about the usefulness of their river cutters in understanding how real rivers
shape the earth.
Because of the effort required by the procedure, it would be helpful if large-scale
curriculum materials could be evaluated by means of sampling a few typical
units. Unfortunately, we have found some year-long and grade-range materials
to be quite uneven in their treatment of benchmarks and standards. Hence,
the analysis results from single units, whether favorable or unfavorable,
cannot be generalized to whole programs. Project 2061 is currently exploring
sampling techniques for grade-range curriculum materials that do not compromise
the reliability and validity of the analysis procedure and that still yield
a fairly accurate picture of the material. In addition, we are identifying
content and instructional analysis criteria that are especially important
for the analysis of programs.
An important consequence of involving a variety of educators in the development
and testing of the 2061 analysis procedure is the creation of a pool of reformers
who understand what alignment involves and who can bring that knowledge to
bear on their work. A cadre of K-12 educators now exists with increased knowledge
of specific learning goals in Benchmarks and NSES and with skills
to evaluate curriculum materials for their fit to these goals. Moreover, teacher
educators are adapting the procedure for use in preservice and professional
development programs, which will produce a larger cadre still.
Additional Considerations
Empirical Verification.
It is important to note that the 2061 analysis procedure, however meticulous,
produces judgments of the likelihood of effectiveness. The potential
of curriculum activities can be estimated by examining whether they address
specific, important learning goals (content match), and whether they are based
on effective principles of teaching and learning (instructional match) for
these goals. However, the "implemented" curriculum will depend on
teachers’ interpretations and use of the materials, and the "achieved"
curriculum will depend on individual students’ skills, interests, and
prior knowledge. Until activities are tried out with students, there will
not be hard evidence for what is actually learned. Yet sound studies of effects
on student learning are expensive and difficult to do. Such data are available
for only a handful of materials and, where available, they correspond well
to the results of the 2061 analysis.
Cost and support.
As noted earlier, the Project 2061 procedure concentrates on only some aspects
of materials analysis: their alignment with specified learning goals and learning
psychology. There are other important variables—such as affordability
and availability of publisher support—that influence the usefulness of
materials and whether they will be used at all. Other evaluation
procedures currently in use focus on these other variables.
Uses of the Project 2061 Procedure
The discussion to this point might seem to suggest that the only purpose of
the Project 2061 analysis procedure is to improve decisions about the selection
of curriculum materials. That might indeed be its initial use. But the procedure
has other important uses, including:
- Identifying shortcomings in existing curriculum materials and suggesting
ways to improve them.
In the short term, schools and districts may not be able to replace all existing
materials with new ones. Indeed, given the short time that Benchmarks
and Standards have been widely available, in contrast to the much longer
time it takes to develop good materials, it is unlikely that sufficient materials
for building a K-12 science literacy curriculum currently exist. So for the
foreseeable future teachers and curriculum specialists in schools and districts
will have to continue the creative improvisations that they have made for
so long. The specific information provided about materials from a 2061 analysis
can help them with this task.
- Increasing teachers’ knowledge of characteristics of well-aligned
materials and developing skills to distinguish materials that are well-aligned
from those that are not.
All of the teachers who have been involved in the development of the procedure
(and even those exposed to it less rigorously during 1- or 2-day workshops)
tell us that the experience has changed forever how they look at curriculum
materials. They are less likely to assume alignment based on developers’
claims or a cursory look at materials themselves. Many claim that it is the
best professional development experience they’ve ever had, because it
highlights distinctions between effective and ineffective instruction toward
specific learning goals. Teacher educators who have participated in the project
are already building components of the training into their preservice teacher
preparation programs so that new teachers will start to develop these important
skills.
- Stimulating the development of and the market for well-aligned materials.
On the one hand, an important part of the Project 2061 rationale for involving
materials developers in the development of the analysis procedure was to encourage
them to attend to both content and instructional analysis criteria in the
new materials they are developing. If funders encourage and support the use
of these criteria then materials developers will be further encouraged to
use them. On the other hand, teachers who understand what constitutes well-aligned
materials are likely, as informed consumers, to increase the demand for materials
that are instructionally well aligned with Benchmarks and Standards.
But while it may be important for every teacher to have some experiences evaluating
materials, to have every teacher evaluating every material is not an effective
use of their time or the nation’s dollars. We recommend three levels
of involvement:
The first attends to immediate practicality. Some choices of materials have
to be made now in order to sustain the momentum of the systemic reform movement.
In the short run, the evaluation procedure we have researched is too demanding
for processing all of the materials that would fairly have to be included
in a resource pool. Yet people who are necessarily going to be using a simpler
procedure should be aware of what a more thorough evaluation would be like
and what sort of results it produces. That raised sensitivity should improve
the quality of simpler judgments that are made. A couple of days of tutored
practice—soon—would be enough for this beginning.
The second level looks ahead over the next couple of years. The number of qualified
evaluators has to grow enough to mount a more searching analysis and build
a base of reviews of well evaluated materials. A fairly large number of practitioners
should be trained over a period of a week or so and brought back together
periodically to share conclusions and assess reliability. A number of these
more experienced evaluators should be curriculum developers, so that they
can plan new projects (and revise old ones) that are aligned with standards
from the start.
The third level of involvement continues to improve the method itself, preparing
the way for more efficient evaluation of more diverse materials in the future.
This R&D will also provide updates to the second-layer work that can improve
and simplify it along the way. Since we believe that Project 2061 (thanks
to foresighted NSF funding) is well ahead on this, we see ourselves as playing
a leading role in that work, and we would welcome collaborators.
A Possible Approach: The Philadelphia Story
The Philadelphia School District had already committed itself to teaching toward
specific learning goals in Benchmarks and NSES and was searching
for suitable materials. Faced with the need to select materials fairly quickly,
it is using a strategy that combines the selection, fixing-up, and professional
development benefits of the 2061 analysis procedure.
Selecting materials.
A curriculum review committee (about a dozen K-12 teachers and teacher educators)
was trained to use the 4-step analysis procedure, used it to produce draft
analysis reports, compared their reports to others on the same material, and
attempted to reconcile, or at least account for, differences. This group will
review existing materials and recommend promising alternatives for district
use. Due to time constraints, their recommendations will be based on results
from only the first step in the 2061 procedure—eliminating materials
that do not appear to focus a significant amount of instruction on specific
learning goals. Although committee members acknowledge the desirability of subjecting materials
to both a content and an instructional analysis before using them with students,
these more rigorous steps in the analysis procedure will be postponed and
used over time on the greatly reduced list that survives the preliminary inspection.
The committee will rely on members’ first-hand experience with the more
rigorous analyses to inform their preliminary judgments. Efforts will focus
on the materials already at least partially examined and found promising by
2061 analysis teams.
Fixing-up.
To use the 2061 procedure to improve existing materials, teachers will (a)
undertake the more rigorous analysis of the content and instructional match
of the chosen materials to benchmarks and (b) use the results of the analysis
as a basis for modifying the materials to better align them. This could include
such remedies as developing questions to focus students’ reflection on benchmark
ideas, adding activities to address reported student learning difficulties,
providing evidence-based arguments to foster student generalization of concepts,
and/or explicitly demonstrating how benchmark ideas are useful for making
sense of the students’ world outside the classroom.
Professional development.
At the same time, a larger and more diverse group of educators is becoming
knowledgeable, through a series of workshops, about specific learning goals
in Benchmarks and NSES and about the analysis criteria used
to judge materials in light of these goals. This knowledge will help them
to recognize both the strengths and weaknesses in existing curriculum materials
in terms of their treatment of specific learning goals -- important because
even relatively good materials still have some distance to go before they
are well aligned. As new, better-aligned materials become available, the District
will have a cadre of informed consumers who can recognize and appreciate them.
Professional development of teachers is being extensively supported by USI,
Eisenhower, and other funds. To help improve curriculum and instruction in
both undergraduate and graduate college courses, the District will encourage
the college faculty themselves to develop knowledge of science literacy as
defined by Science for All Americans, to recognize the importance of
specific learning goals, and to use them in designing professional development
for Philadelphia teachers. In this way, both teachers pursuing graduate degrees
and new teachers will have significant blocks of time focused on science literacy.
References Cited
American Association for the Advancement of Science. (1997).
Resources for Science Literacy: Professional development. New York,
NY: Oxford University Press.
American Association for the Advancement of Science. (1993).
Benchmarks for Science Literacy. New York, NY: Oxford University
Press.
American Association for the Advancement of Science. (1989).
Science for All Americans. New York, NY: Oxford University Press.
Berkheimer, G.D., Anderson, C.W., Lee, O., and Blakeslee,
T.D. with Eichinger, D. and Sands, K. (1988). Matter and Molecules: Teachers
Guide: Science Book and Activity Book. East Lansing, MI: The Institute
for Research on Teaching, College of Education, Michigan State University.
Occasional paper No. 121.
Driver, R., Squires, A., and Wood-Robinson, V. (1994).
Making sense of secondary science: Research into children’s ideas.
New York, NY: Routledge.
National Research Council. (1996). National Science Education
Standards. Washington, DC: National Academy Press.
Roseman, J. (1997). Implementing benchmarks and standards:
Lessons from Project 2061. The Science Teacher 64 (1), pp. 26-29.
Appendix A: A More Detailed Look at the Procedure
Preliminary Inspection.
Let’s assume for the moment that we are starting with materials that appear
promising—the content doesn’t appear too far outside the scope of
science literacy and the material includes lots of hands-on activities. The
task becomes listing some specific learning goals on which the material appears
to focus.
First, reviewers search fairly quickly through the material (both student material
and teachers’ guides) to make a preliminary list of all the specific learning
goals that would seem likely to be targeted. The material is then examined
more carefully to locate and record all places where each learning goal is
actually served—e.g., particular readings, experiments, discussion questions.
(A sighting must be explicit in the material.) Then, based on the number and
types of sightings, a decision is made about which benchmarks and standards
warrant a more careful analysis.
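The tallying step described above can be pictured with a minimal sketch. It assumes sightings are recorded as (learning goal, location, kind) entries; the record layout, the function name, and the cutoff values are all hypothetical illustrations, since the procedure itself leaves the decision to reviewers' judgment rather than fixed thresholds.

```python
from collections import defaultdict

# Hypothetical sighting records: (learning goal, location in material, kind of sighting).
sightings = [
    ("Benchmark A", "Lesson 2 reading", "reading"),
    ("Benchmark A", "Lesson 3 experiment", "experiment"),
    ("Benchmark A", "Lesson 5 questions", "discussion"),
    ("Benchmark B", "Lesson 4 reading", "reading"),
]

def goals_for_closer_analysis(sightings, min_sightings=3, min_types=2):
    """Return goals whose sightings meet illustrative count and variety cutoffs.

    The cutoffs are assumptions made for this sketch, not part of the
    Project 2061 procedure.
    """
    by_goal = defaultdict(list)
    for goal, location, kind in sightings:
        by_goal[goal].append((location, kind))

    selected = []
    for goal, found in by_goal.items():
        kinds = {kind for _, kind in found}
        if len(found) >= min_sightings and len(kinds) >= min_types:
            selected.append(goal)
    return sorted(selected)

print(goals_for_closer_analysis(sightings))
```

In practice a reviewer would weigh the quality of each sighting as well as its count, so a tally like this could at most help organize the evidence.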
Content Analysis.
This analysis is a more rigorous examination of the link between the subject
material and the selected learning goals. This involves giving precise attention
to both ends of the match – the precise meaning of the benchmark on one
end, and the precise intention of the material on the other. The material
is examined with respect to such questions as:
Do the activities called for in the material address the substance of a specific
benchmark or only the benchmark’s general "topic?"
Do the activities reflect the level of sophistication of the specific benchmark
or are the activities more appropriate for targeting benchmarks at an earlier
or later grade level?
Do the activities address all parts of a specific benchmark, or only some?
If the latter, what is the consequence? (While it is not necessary that any
particular activity or unit would address all the ideas in a benchmark or
standard, the K-12 curriculum as a whole should do so. The purpose of this
question is to provide an account of precisely what ideas are treated.)
For the material as a whole, an attempt is made to estimate the degree of overlap
between its content and the learning goals of interest. Thus the analysis strives to
answer questions such as these:
Does the material address all benchmarks of interest for a given topic
and grade level? Which, if any, are not treated? Are they useful or even essential
to the development of already included benchmarks?
Does the material contain content not required for reaching science literacy
learning goals? If so, in what proportion? Does the material clearly distinguish
between the two kinds of content? (While distinguishing content essential
for literacy from non-essential content might seem to be a luxury in a material,
making the distinction increases the range of students for which the material can be
used. Marking the non-essential content makes it easier for the teacher to direct
better students to enrichment activities and allows students themselves to
avoid overload from ideas that go beyond what literacy requires.)
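The overlap questions above amount to comparing two sets: the learning goals of interest and the goals the material actually serves. A minimal sketch, assuming each activity has already been mapped to the goals it serves (the mapping, the goal labels, and the function name are hypothetical, and counting activities is only a rough stand-in for "proportion" of content):

```python
def content_overlap(goals_of_interest, activity_goals):
    """Summarize overlap between a material's activities and the goals of interest.

    activity_goals maps each activity name to the set of goals it serves.
    An activity serving none of the goals of interest is counted as extra content.
    """
    interest = set(goals_of_interest)
    addressed = set()
    extra = []
    for activity, goals in activity_goals.items():
        served = set(goals) & interest
        if served:
            addressed |= served
        else:
            extra.append(activity)

    share_extra = len(extra) / len(activity_goals) if activity_goals else 0.0
    return {
        "untreated_goals": sorted(interest - addressed),
        "extra_activities": sorted(extra),
        "share_extra": share_extra,
    }

# Illustrative data only.
example = content_overlap(
    {"G1", "G2", "G3"},
    {"A1": {"G1"}, "A2": {"G1", "G2"}, "A3": set(), "A4": {"G9"}},
)
print(example)
```

A summary like this says nothing about whether an untreated goal is essential to the others, which remains a judgment for the reviewer.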
Instructional Analysis.
The purpose of the instructional analysis is to estimate how well the material
addresses targeted benchmarks from the perspective of what is known about
student learning and effective teaching. The criteria for making such judgments
are derived from research on learning and teaching and from the craft knowledge
of experienced educators. In the context of science literacy, summaries of
these have been formulated in Chapter 13: Effective Learning and Teaching
in Science for All Americans; in Chapter 15: The Research Base in Benchmarks
for Science Literacy; and, for science education alone, in Chapter 3: Science
Teaching Standards in National Science Education Standards.
From those sources, seven criteria clusters have been identified to serve as
a basis for the instructional analysis. (One could view these as standards
for instructional materials.) A draft of the specific questions within each
cluster is shown below. The proposition here is that (1) in the ideal all
questions within each cluster would be well addressed in a material –
they are not alternatives; and (2) this analysis has to be made for each
benchmark separately – if we are serious about having science-literate
high school graduates, then we want to focus effective instruction on every
single one of the important ideas in Science for All Americans.
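The two-part proposition can be illustrated with a small sketch. It assumes, purely for illustration, that each question is recorded as a yes/no judgment; actual analysis reports are narrative and evidence-based, and the function name and benchmark labels here are hypothetical. A cluster counts as met for a benchmark only if every one of its questions is addressed, and each benchmark is judged separately.

```python
def cluster_summary(ratings):
    """Collapse per-question judgments into per-cluster results for each benchmark.

    ratings: {benchmark: {cluster: {question: bool}}}
    A cluster is met only when all of its questions are addressed
    (the questions are not alternatives). Empty clusters count as unmet.
    """
    return {
        benchmark: {
            cluster: bool(questions) and all(questions.values())
            for cluster, questions in clusters.items()
        }
        for benchmark, clusters in ratings.items()
    }

# Illustrative judgments for two hypothetical benchmarks on Cluster I.
ratings = {
    "Benchmark A": {"I. Sense of Purpose": {
        "Framing": True, "Connected sequence": True,
        "Fit of frame and sequence": True, "Activity purpose": True}},
    "Benchmark B": {"I. Sense of Purpose": {
        "Framing": True, "Connected sequence": False,
        "Fit of frame and sequence": False, "Activity purpose": True}},
}
print(cluster_summary(ratings))
```

The same material can therefore fare well on one benchmark and poorly on another, which is the reason for analyzing each benchmark on its own.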
Cluster I. Providing a Sense of Purpose:
Part of planning a coherent curriculum involves deciding on its purposes and
on what learning experiences will likely contribute to achieving those purposes.
But while coherence from the designers’ point of view is important, it
may be inadequate to give students the same sense of what they are doing and
why. This cluster includes criteria to determine whether the material attempts
to make its purposes explicit and meaningful, either by itself or by instructions
to the teacher.
Framing. Does the material begin with important focus problems, issues, or
questions about phenomena that are interesting and/or familiar to students?
Connected sequence. Does the material involve students in a connected sequence
of activities (versus a collection of activities) that build toward understanding
of a benchmark(s)?
Fit of frame and sequence. If there is both a frame and a connected sequence,
does the sequence follow well from the frame?
Activity purpose. Does the material prompt teachers to convey the purpose
of each activity and its relationship to the benchmarks? Does each activity
encourage each student to think about the purpose of the activity and
its relationship to specific learning goals?
Cluster II. Taking Account of Student Ideas:
Fostering better understanding in students requires taking time to attend to
the ideas they already have, both ideas that are incorrect and ideas that
can serve as a foundation for subsequent learning. Such attention requires
that teachers are informed about prerequisite ideas/skills needed for understanding
a benchmark and what their students’ initial ideas are -- in particular,
the ideas that may interfere with learning the scientific story. Moreover,
teachers can help address students’ ideas if they know what is likely
to work. This cluster examines whether the material contains specific suggestions
for identifying and relating to student ideas.
Prerequisite knowledge/skills. Does the material specify prerequisite knowledge/skills
that are necessary to the learning of the benchmark(s)?
Alerting to commonly held ideas. Does the material alert teachers to commonly
held student ideas (both troublesome and helpful) such as those described
in Benchmarks Chapter 15: The Research Base?
Assisting the teacher in identifying students’ ideas. Does the material
include suggestions for teachers to find out what their students think
about familiar phenomena related to a benchmark before the scientific ideas
are introduced?
Addressing commonly held ideas. Does the material explicitly address commonly
held student ideas?
Assisting the teacher in addressing identified students’ ideas. Does the
material include suggestions for teachers on how to address ideas that their
students hold?
Cluster III. Engaging Students With Phenomena:
Much of the point of science is explaining phenomena in terms of a small number
of principles or ideas. For students to appreciate this explanatory power,
they need to have a sense of the range of phenomena that science can explain.
"Students need to get acquainted with the things around them—including
devices, organisms, materials, shapes, and numbers—and to observe them,
collect them, handle them, describe them, become puzzled by them, ask questions
about them, argue about them, and then try to find answers to their questions."
(SFAA, p. 201) Furthermore, students should see that the need to explain
comes up in a variety of contexts.
First-hand experiences. Does the material include activities that provide first-hand
experiences with phenomena relevant to the benchmark when practical and, when
not practical, make use of videos, pictures, models, simulations, etc.?
Variety of contexts. Does the material promote experiences in multiple, different
contexts so as to support the formation of generalizations?
Questions before answers. Does the material link problems or questions about
phenomena to solutions or ideas?
Cluster IV. Developing and Using Scientific Ideas:
Science for All Americans includes in its definition of science literacy
a number of important yet quite abstract ideas—e.g., atomic structure,
natural selection, modifiability of science, interacting systems, common laws
of motion for earth and heavens. Such ideas cannot be inferred directly from
phenomena, and the ideas themselves were developed over many hundreds of years
as a result of considerable discussion and debate about the cogency of theory
and its relationship to collected evidence. Science literacy requires that
students see the link between phenomena and ideas and see the ideas themselves
as useful. This cluster includes criteria to determine whether the material
attempts to provide links between phenomena and ideas and to demonstrate the
usefulness of the ideas in varied contexts.
Building a case. Does the material suggest ways to help students draw from
their experiences with phenomena, readings, activities, etc. to develop an
evidence-based argument for benchmark ideas? (This could include reading
material that develops a case.)
Introducing terms. Does the material introduce technical terms only in conjunction
with experience with the idea or process and only as needed to facilitate
thinking and promote effective communication?
Representing ideas. Does the material include appropriate representations of
scientific ideas?
Connecting ideas. Does the material explicitly draw attention to appropriate
connections among benchmark ideas (e.g., to a concrete example or instance
of a principle or generalization, to an analogous idea, or to an idea that
shows up in another field)?
Demonstrating/modeling skills and use of knowledge. Does the material demonstrate/model
or include suggestions for teachers on how to demonstrate/model skills or
the use of knowledge?
Practice. Does the material provide tasks/questions for students to practice
skills or using knowledge in a variety of situations?
Cluster V. Promoting Student Reflection:
No matter how clearly materials may present ideas, students (like all people)
will make their own meaning out of them. Constructing meaning well is facilitated
by having students (a) make their ideas and reasoning explicit, (b) hold them
up to scrutiny, and (c) recast them as needed. This cluster includes criteria
for whether the material suggests how to help students express, think about,
and reshape their ideas to make better sense of the world.
Expressing ideas. Does the material routinely include suggestions (such as
group work or journal writing) for having each student express, clarify, justify,
and represent his/her ideas? Are suggestions made for when and how students
will get feedback from peers and the teacher?
Reflecting on activities. Does the material include tasks and/or question sequences
to guide student interpretation and reasoning about phenomena and activities?
Reflecting on when to use knowledge and skills. Does the material help or include
suggestions on how to help students know when to use knowledge and skills
in new situations?
Self-monitoring. Does the material suggest ways to have students check their
own progress and consider how their ideas have changed and why?
Cluster VI. Assessing Progress:
There are several important reasons for monitoring student progress toward
specific learning goals. Having a collection of alternative assessments can ease the
creative burden on teachers and increase the time available to analyze student
responses and make adjustments in instruction based on them. This cluster
includes criteria for whether the material includes a variety of goal-relevant
assessments.
Alignment to goals. Assuming a content match of the curriculum material to
this benchmark, are assessment items included that match the content?
Application. Does the material include assessment tasks that require application
of ideas and avoid allowing students a trivial way out, like using a formula
or repeating a memorized term without understanding?
Embedded. Are some assessments embedded in the curriculum along the way, with
advice to teachers as to how they might use the results to choose or modify
activities?
Cluster VII. Enhancing the Learning Environment:
Many other important considerations are involved in the selection of curriculum
materials—for example, the help they provide teachers in encouraging
student curiosity and creating a classroom community where all can succeed,
or the material’s scientific accuracy or attractiveness. Each of these
can influence student learning, and even whether the materials are used. The criteria
listed in this cluster provide reviewers with the opportunity to comment on
these and other important features.
Teacher content learning. Would the material help teachers improve their understanding
of science, mathematics, and technology and their interconnections?
Classroom environment. Does the material help teachers to create a classroom
environment that welcomes student curiosity, rewards creativity, encourages
a spirit of healthy questioning, and avoids dogmatism?
Welcoming all students. Does the material help teachers to create a classroom
community that encourages high expectations for all students, that enables
all students to experience success, and that gives students of all kinds
a feeling of belonging in the science classroom?
Connecting beyond the unit. Does the material explicitly draw attention to
appropriate connections to ideas in other units?
Other strengths. What, if any, other features of the material are worth noting?
Summary Report.
Once both the content and the instruction aimed at that content have been analyzed,
the final step in the process is to prepare a report that summarizes the material’s
treatment of specific benchmarks and, drawing on that evidence, comments more
generally on strengths and weaknesses of the material. Nonetheless, the report
stops short of an overall recommendation. It is for educators to decide whether
to use the material as is, to use it with modifications, or not use it at
all. A goal-centered analysis of the kind developed by Project 2061 should
help them make better decisions than would otherwise be possible.
Appendix B: Curriculum Materials Analyzed
K-5 Units
Insights: Changes of State (EDC)
ESS: Where is the Moon? (EDC)
Science and Technology for Children: Food Chemistry (NSRC)
Nuffield Primary Science: Earth & Space (Collins Educational)
Nuffield Primary Science: Living Things in Their Environment (Collins
Educational)
FOSS: Models & Designs (LHS, Britannica)
FOSS: The Structures of Life (LHS, Britannica)
Used Numbers: The Shape of the Data (TERC)
Grades 6-8 Units
Changes in Matter (Macmillan)
Food, Energy, and Growth (Michigan Department of Education)
GEMS: River Cutters (LHS)
Matter and Molecules (Institute for Research on Teaching, MSU)
Power Plant (Institute for Research on Teaching, MSU)
Science 2000 (D.C. Heath)
Science Focus: The Salters’ Approach: Drinks (Heinemann)
SciencePlus: Life Processes (Holt, Rinehart, Winston)
SEPUP: Issues, Evidence and You (LHS)
Technology Units
Materials World Modules: Composites (Northwestern University)
Nuffield Design and Technology (Collins Educational)
Introduction To Design & Technology: Control Technology Systems
(Todd et al., Taylor)
TSM Integration Project: Cabin Insulation (LaPorte and Sanders, Glencoe)
Middle School Program
SciencePlus (Holt, Rinehart, Winston)
Grades 9-12 Biology Units
Insights in Biology: The Matter of Life (EDC)
Biological Science: A Human Approach: Evolution (BSCS, Kendall Hunt)
Biological Science: An Ecological Approach (BSCS, Kendall Hunt)
Heath Biology: Unit II (D.C. Heath)
Grades 9-12 Chemistry Units
ChemCom: Conserving Chemical Resources (American Chemical Society, Kendall
Hunt)
Chemistry that Applies (Michigan Department of Education)
Salters Chemistry: Burning & Bonding
Visualizing Matter: Atomic Structure (Holt)
Grades 9-12 Physics Units
Active Physics: Predictions (AAPT)
Project Star: Chapters 10-15 (Coyle et al., Kendall Hunt)
Conceptual Physics: Chapters 2-6 (Hewitt, Addison-Wesley)
Roseman, J. E., Kesidou, S., and L. Stern (1997). Identifying Curriculum Materials
for Science Literacy. A Project 2061 Evaluation Tool. Based on a paper prepared
for the colloquium "Using the National Science Education Standards to
Guide the Evaluation, Selection, and Adaptation of Instructional Materials."
National Research Council, November 10-12, 1996.