Project 2061

Middle Grades Mathematics Textbooks

A Benchmarks-Based Evaluation

Part 2 (Continued)
The Evaluation in Detail

To provide a better sense of how the evaluation was conducted, the following describes in more detail each step in the curriculum-materials analysis procedure. The description includes examples of how the procedure was applied to obtain ratings of the textbooks. Details of the method of analysis, including the training of reviewers, the assignment of textbooks to review teams, and other information about the review process, are available in Appendix C.

Step 1. The analysts examine each textbook activity—a lesson or part of a lesson—that matches the content of the benchmark.
Content analysis begins with a careful page-by-page survey of the textbook’s activities, lessons, exercises, and other learning opportunities to find those that address the six benchmarks selected for the evaluation. Analysts use a software utility to keep track of what they find. For each activity "sighting," analysts describe the activity’s purpose and location and explain how it addresses the content of the benchmark.
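As a rough illustration of the record keeping described above, a sighting could be stored as a small record holding the activity's purpose, its location, and the explanation of how it addresses the benchmark. The source does not describe the software utility that analysts used, so the class name, field names, helper function, and sample entry below are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Sighting:
    """One activity 'sighting' (hypothetical record layout, not the actual utility)."""
    benchmark: str       # benchmark the activity was matched to
    location: str        # where the activity appears in the textbook
    purpose: str         # analyst's description of the activity's purpose
    alignment_note: str  # how the activity addresses the benchmark content


def sightings_for(sightings, benchmark):
    """Return the sightings recorded for a single benchmark."""
    return [s for s in sightings if s.benchmark == benchmark]


# Invented sample entry, for illustration only.
log = [
    Sighting(
        benchmark="11C, grades 6-8, benchmark 4",
        location="Unit 3, Lesson 2",
        purpose="Students write an equation for temperature change by month.",
        alignment_note="Uses an equation to summarize change over time.",
    ),
]

matches = sightings_for(log, "11C, grades 6-8, benchmark 4")
```

Grouping sightings by benchmark in this way would let an analyst review all of the activities for one benchmark together in the later steps.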

Step 2. Analysts determine the extent to which an activity addresses the benchmark concept or skill.
Evaluating the link between an activity’s content and the content of the selected mathematics benchmarks requires analysts to pay close attention to the precise meaning of the benchmark on the one hand and the precise intention of the textbook on the other. They must answer the following key questions about each activity sighting: Does the activity address the specific substance of the benchmark or only its general "topic"? Does the mathematics content in the activity reflect the level of sophistication of the benchmark, or are the activities more appropriate for an earlier or later grade level? What specific parts or ideas of a benchmark does the textbook cover?

The following examples illustrate these content alignment requirements for the grade 6-8 benchmark: Symbolic equations can be used to summarize how the quantity of something changes over time or in response to other changes. (Chapter 11C, grades 6-8, benchmark 4, p. 274)

  • Topic vs. Substance: The topic of the benchmark seems to be "symbolic equations." If we study the benchmark carefully, however, we see that it is really about how equations summarize changes in quantities. To address the substance of the benchmark, an activity would need to involve students in using equations to represent and describe how a quantity such as daily temperature changes over the months of the year, explaining patterns in the changes, or using the equation to make predictions. Activities that involve students only in writing, simplifying, evaluating, or solving equations do not necessarily align with this benchmark.
  • Sophistication: An activity or lesson may align with a benchmark at an earlier or later grade level rather than the intended one. For example, a lesson that involves students in exploring patterns of change without using symbolic equations would not be at the appropriate level for the grades 6-8 benchmark stated above. On the other hand, a lesson that focused on manipulating the variables or combining equations to find solutions would be more appropriate for aligning with a benchmark for later grades.
  • Part or whole: There is nothing wrong with an activity addressing only a part of a benchmark, but it is important to know exactly which part the activity addresses. The benchmark above contains two main ideas: changes over time, and changes in response to other changes. For example, an activity in which students write equations that express the relationship of the area of a circle to its radius addresses the second part of the benchmark, not the whole benchmark.

Step 3. Analysts decide which of the 24 instructional criteria apply to the activities they have identified throughout the textbook and then determine whether each activity meets the indicators for each applicable criterion.
The purpose here is to estimate how well the activity addresses the targeted benchmark from the perspective of what is known about student learning and effective teaching. Rather than looking at the textbook’s instructional design as a whole, analysts must consider whether the instructional strategies that relate to an activity will help students learn the specific concepts and skills contained in one of the six benchmarks used in the evaluation.

Step 4. Based on the number of indicators met, analysts rate the activity on a scale of 0 to 3 for each criterion.
Each instructional criterion has between 2 and 6 indicators. The scoring scheme guides analysts in using the indicators to assign ratings. Criterion ratings are based on which indicators are met.

The following diagram illustrates the steps that analysts used to evaluate the quality of instruction for each mathematics benchmark presented in typical textbook activities.


[Diagram: steps analysts used to evaluate the quality of instruction for each benchmark (Adobe PDF document)]

Ensuring Reliability

To prepare for the evaluation, Project 2061 tested its curriculum-materials analysis procedure for consistency of results from analyst to analyst. In this reliability study, 14 analysts who had received extensive training in the procedure independently evaluated two sets of middle grades mathematics materials. The analysts agreed on 80% of their ratings for one set and 97% for the other (Kulm & Grier, 1998). These results provided sufficient confidence in the procedure’s reliability to proceed with the full-scale evaluation of middle grades mathematics textbooks. The analysis procedure continued to produce a high level of rater agreement across all of the benchmarks and all of the textbooks. For 13 textbook series (including the Saxon series), the average rater agreement was 90%, with a range from 80% to 100%.
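The source does not say how rater agreement was calculated. One common and simple measure is pairwise percent agreement: the share of analyst pairs, pooled over all rated items, that gave the same rating. The sketch below assumes that definition. The function name and the sample ratings are invented for illustration and are not data from the study.

```python
from itertools import combinations


def percent_agreement(ratings_by_item):
    """Pairwise percent agreement, pooled over all items.

    ratings_by_item: one list per rated item, holding each analyst's
    0-3 rating for that item. This is an assumed definition of agreement,
    not necessarily the one used in the reliability study.
    """
    agreeing_pairs = 0
    total_pairs = 0
    for ratings in ratings_by_item:
        for a, b in combinations(ratings, 2):
            total_pairs += 1
            if a == b:
                agreeing_pairs += 1
    if total_pairs == 0:
        raise ValueError("need at least one item rated by two or more analysts")
    return 100.0 * agreeing_pairs / total_pairs


# Invented ratings: four criteria, each rated by three analysts.
sample = [[3, 3, 3], [2, 2, 1], [0, 0, 0], [1, 1, 1]]
agreement = percent_agreement(sample)  # 10 of 12 pairs agree
```

Percent agreement does not correct for agreement that would occur by chance. A chance-corrected statistic such as Cohen's kappa would give a stricter estimate, but nothing in the source indicates that one was used.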
