Motion and Forces Curriculum Unit 1

RUNNING HEAD: Motion and Forces Curriculum Unit

The impact of a middle school motion and forces curriculum unit on student outcomes: Results from consecutive quasi-experimental studies

Robert Ochsendorf
Curtis Pyke
Sharon Lynch
William Watson

The George Washington University

Paper presented at the Annual Meeting of the National Association for Research in Science Teaching
San Francisco, California
April 3-6, 2006

This work is supported by the National Science Foundation, the U.S. Department of Education, and the National Institutes of Health (REC-0228447). Any opinions, findings, conclusions, or recommendations are those of the authors and do not necessarily reflect the position, policy, or endorsement of the funding agencies.

Introduction

In education research, there has recently been a call for more scientifically-based research using experimental designs and replication studies as a way to enhance knowledge accumulation and theory building across contexts (Mosteller & Boruch, 2002; National Research Council, 2005). Replications are important because they require the application of similar conditions for multiple cases and, if results are consistent, help justify generalizations and support theory building. Without convergence of results from multiple studies, the objectivity, neutrality, and generalizability of research efforts are questionable (National Research Council, 2005). Such research is clearly in the public interest if progress is to be made in building the knowledge base in education research. This, in turn, should lead either to improved student outcomes or to the abandonment of non-productive and costly educational interventions that simply don’t work.

Education researchers and other social scientists have conducted replication studies on various programs and interventions. However, the use of the term replication by researchers has been highly inconsistent. Some replication studies have involved repetition of the study methodology in different settings with entirely new sets of participants (i.e., students or teachers) (Schneider, 2003). In these kinds of studies, replication involves applying the same conditions and designs to multiple cases that are sufficiently different to justify the generalization of results in theories. Often, the goal of such replications is to advance the generalizability of the program or intervention by establishing effectiveness across a variety of contexts with an array of individuals (see, for example, Hubbard, Giese & Rainey, 1998). By focusing on the replication of results, these kinds of studies have the potential to contribute to the objectivity, neutrality, and generalizability called for by the National Research Council.

Other researchers have used the term replication to describe a different notion. For example, Adelman and Taylor (1997) describe replication as being concerned with “prototype development and widespread diffusion of new approaches to schooling” (sometimes called roll-out or scale-up). Their diffusion model of replication has the goal of advancing “state of the art” educational programs and draws on the psychological and organizational literature. This view of replication seems less concerned with replicating results than with replicating the program itself. Somewhat related to Adelman and Taylor’s notion of replication is Bradach’s view. Bradach (2003) describes replication as “the movement of an organization’s theory of change to a new location” and equates replication in the social sector with franchising in the private sector. Bradach’s model of replication, however, is concerned with the initial collection of substantive evidence of success to justify replication, followed by the subsequent replication of that program’s success.

Our view of replication is consistent with the views of Schneider and the National Research Council (NRC). Despite the NRC’s call for replication of studies as a way to promote knowledge accumulation, few replication studies currently exist in the literature. The research presented here seeks to replicate a quasi-experimental study that explores the effectiveness of science curriculum materials. In this study we draw on data from the same group of teachers and schools (the student cohort does change) for two consecutive years of curriculum unit implementation. With this kind of replication, we hope not only to confirm the results obtained in the first year of implementation by using an additional study cohort but also to learn more about the “experience effect” of the intervention as it is implemented with the same teachers and schools.
That is, we seek to study the effectiveness of curriculum materials as they are implemented in consecutive years. This kind of replication study not only allows for knowledge accumulation regarding the effectiveness of curriculum materials, but it also provides an opportunity to explore the experience of the intervention as it moves from the first year of implementation to the second year of implementation.

Current research suggests that study participants are likely to improve in their implementation of an intervention when moving from the first to the second attempt. Geier (2005) found that teachers continue to strengthen their use of reform-based curriculum materials through their second and third years using the units. Another study showed that middle school chemistry students in classrooms with teachers who were more experienced with a given set of curriculum materials had higher gain scores than students in classrooms whose teachers had no experience with the same materials (Fogleman & McNeil, 2005). Similarly, this research attempts to explore patterns in student outcomes as teachers and schools move from their first year of implementation to the second year of implementation. By focusing on comparisons made between the first year and second year of implementation of the curriculum materials, we hope to propose a way of discussing the effects of experience on the implementation of curriculum materials.

In this research, experience is taken to mean the familiarity that develops as teachers, schools, and school systems move from the first year of curriculum materials implementation to subsequent years of implementation. Familiarity, for example, can develop as teachers become more comfortable with the sequence of lessons, are better able to anticipate questions from students or problems with materials, or are more capable of predicting likely outcomes of experiments and patterns in data.
Teachers are also more likely to focus on student learning in later implementations than in the first year, when more time and energy are spent on logistics and simply completing the lessons. In addition, schools in later years of implementation become more efficient at assembling and working with the materials involved in carrying out investigations and at figuring out how school scheduling either enhances or hinders implementation. Likewise, school systems are likely to clarify and distribute knowledge gained through an earlier implementation via professional development sessions or other interactions between and among stakeholders and members of the community. These are a few of the ways in which experience is hypothesized to positively influence the implementation of curriculum materials beyond the initial implementation. This research will not operationalize or attempt to measure experience variables other than to note that the same schools and many of the same teachers implemented the curriculum materials for two consecutive years.

In addition, by focusing on a second year of implementation, we hope to collect data that will allow for the fairest possible judgments to be made regarding the effectiveness of the intervention. That is, in the first year of any program implementation, a nontrivial level of adjustment and orientation is likely to be necessary on the part of implementers. Therefore, collecting the same outcome data in the second year of implementation is perhaps more likely to provide a fair test of the intervention’s effectiveness than relying on the initial implementation alone. This approach is consistent with Chatterji’s (2004) call for more complete and effectual designs so that research is conducted within a longer-term framework that attempts to capture the life of the program or intervention.

The scientifically-based research presented here focuses on the effectiveness of a set of science curriculum materials.
Currently in science education there is little scientifically-based research evidence (i.e., from experimental or quasi-experimental designs) on the effectiveness of curriculum materials. Much of the research on curriculum materials tends to focus on design studies or case-level analyses that do not use comparison or control groups to support their claims regarding the quality or true effectiveness of a given intervention. Effectiveness studies that do use comparison groups have been conducted on math curriculum materials (see Senk & Thompson, 2003), but very few such studies exist in science education. A brief literature review conducted by the authors of this paper turned up only a few research studies of science curriculum materials that employed experimental designs with comparison groups. Thus, the work presented in this paper seeks to provide scientifically based evidence to test the effectiveness of science curriculum materials by examining two consecutive years of implementation data.

Given the need for this kind of research in science education, this paper seeks to answer the following research questions on the effectiveness of a curriculum unit, Motion and Forces (Harvard-Smithsonian Center for Astrophysics, 2001), in the context of two consecutive implementation studies:

Implementation Questions

1. Does the Motion and Forces curriculum unit produce higher outcome scores on measures of concept understanding, motivation, and learning engagement?
2. Does disaggregating outcome data reveal important outcome patterns not captured by aggregate mean scores?

Cross Study Comparison Questions

3. Does the effectiveness of the curriculum unit increase as schools and teachers become more experienced implementing it from the first year to the second year?
4. Are changes in the relationship among achievement, motivation, and engagement observed from the first year to the second year?

The intervention in this study is a curriculum unit entitled Exploring Motion and Forces: Speed, Acceleration, and Friction (Harvard-Smithsonian Center for Astrophysics, 2001), referred to hereafter simply as Motion and Forces. This unit was developed for the ARIES curriculum program by the Science Education Department at the Harvard-Smithsonian Center for Astrophysics. It is a six-week physical science curriculum unit designed for grades 5-8. These curriculum materials are inquiry-centered and activity-based, with an emphasis on students’ direct experience with phenomena. A substantial amount of materials and supplies is required to implement the unit. Students use materials to construct the tools for conducting the Motion and Forces investigations, including sliding disks, ramps, and rolling carts.

Motion and Forces focuses on a middle school science target concept from Benchmarks for Science Literacy (AAAS, 1993): Changes in speed or direction of motion are caused by forces. The greater the force is, the greater the change in motion will be. The more massive an object is, the less effect a given force will have (p. 89). An object at rest stays that way unless acted on by a force…an object in motion will continue to move unabated unless acted on by a force (AAAS, p. 90). Figure 1 shows this force and motion idea situated relative to other force and motion ideas at the middle school level.

Theoretical Framework

This paper draws on research that views the design of science instruction as a means to support conceptual change learning (Hewson, 1981; Hewson & Hewson, 1984; Strike & Posner, 1992), provided that the instruction is thoughtfully designed with this goal in mind. Conceptual change theory suggests that students approach science concepts with naïve ideas that may be readily changed through instruction. Conceptual change is the process by which students move from naïve knowledge states to more sophisticated, scientific understandings of the world. Also, it is understood that instruction in many classrooms is strongly influenced by the kinds of curriculum materials (e.g., textbooks, manuals) that are being used by the teacher (Goodlad, 1984). Teachers’ reliance on curriculum materials may be even more pronounced in science classrooms (Ball & Cohen, 1996; Lynch, 2000).

Recent research on conceptual change has begun to explore the role of achievement goals (motivation) and engagement in promoting conceptual change (Linnenbrink & Pintrich, 2003; Pintrich, Marx & Boyle, 1993; Strike & Posner, 1992). Students’ achievement goal orientation is thought to be influenced by the contexts in which the students learn. A major premise of this study is that students’ use of highly rated curriculum materials is a crucial contextual factor that supports conceptual change for students who experience the unit. Achievement goal theory suggests that while performance goals are helpful in promoting student learning, these goals alone do not consistently promote the type of student engagement required to facilitate conceptual change. On the other hand, mastery learning goals (alone or with performance goals) seem to be a factor that sets the stage for conceptual change (Linnenbrink & Pintrich, 2003). This paper will contribute to this research by exploring patterns of motivation and engagement as they relate to conceptual change in diverse middle school science classrooms implementing highly rated curriculum materials.

At the middle school level, many science topics are difficult to learn because of students’ persistent naïve understandings of these abstract ideas. The topics of forces and motion are among the more difficult concepts for students to understand at the middle level. Students’ intuitions and everyday experiences often impede the learning of the scientifically correct, Newtonian models of the target concepts (Champagne, Gunstone & Klopfer, 1985; Gunstone & Watts, 1985). In fact, understanding an abstract concept such as force is difficult even for high school students (Minstrell, 1989). However, some research suggests that it is possible to change middle school students’ intuitive concepts of motion and forces (White & Horwitz, 1987).

This study is based on the assumption that a curriculum unit designed to promote conceptual change, such as Motion and Forces, can make a difference in science teaching and learning and hopefully scaffold students’ learning. But such curriculum materials need to be identified as having the potential to promote and scaffold learning. One way of assessing this potential is to determine whether the materials meet the Project 2061 Curriculum Analysis criteria. Consequently, before implementing Motion and Forces, researchers in this study convened a group of raters to analyze Motion and Forces using the Project 2061 Curriculum Analysis (Kesidou & Roseman, 2002; Roseman, Kesidou & Stern, 1996). The Curriculum Analysis includes a thorough Content Analysis that determines which specific concepts the unit targets, and an Instructional Analysis, which profiles how well the unit’s instructional strategies support student learning of those target concepts. For Motion and Forces, the resulting set of ratings indicated that the materials fulfilled many of the criteria required by the Analysis process (i.e., providing a sense of purpose, engaging students with relevant phenomena, and encouraging students to explain their ideas), and were somewhat similar to other highly rated middle school curriculum units (for complete results of the Motion and Forces Curriculum Analysis, see Ochsendorf, 2003).

This research is also guided by concerns of equity in science education reform. There is a growing body of evidence from national studies of performance in U.S. schools (i.e., NAEP & NELS) to suggest that the achievement gap in math and science is resistant to change despite reform efforts and has perhaps widened in recent years. Among many factors, one potential reason for poor performance in math and science may be the quality of the curriculum materials being used. Therefore, this research hypothesizes that high quality curriculum materials, if implemented with fidelity, could help all students learn science and perhaps have positive impacts on traditionally underserved groups of students.

Design and Procedure

The study sample was taken from the 6th grade student population in Montgomery County Public Schools (MCPS), a large Maryland school district (approximately 136,000 students total, 32,000 in grades 6-8) located in the Washington, DC metropolitan area. MCPS is rapidly becoming more culturally, linguistically, and socioeconomically diverse. In this study, two different cohorts of 6th graders in the same five matched pairs of schools formed the study samples for two consecutive implementations of Motion and Forces, during the 2003-2004 and 2004-2005 school years (we refer to these implementations as Year 2 and Year 3 because they are situated within a larger six-year research agenda). The schools in each matched pair were randomly assigned either to implement the intervention (Motion and Forces) or to use a standard menu of options (the Comparison group). This sampling method was expected to produce two equivalent samples, representative of the study population, with enough students to provide power for significance tests on disaggregated subgroups (Year 2 N = 2172, Year 3 N = 2498).
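The matched-pair assignment described above can be illustrated with a short sketch. It assumes that one school in each pair receives the intervention and the other serves as the comparison; the school labels and the `assign_pairs` helper are hypothetical and do not come from the study.

```python
import random


def assign_pairs(pairs, seed=None):
    """Randomly assign one school per matched pair to each condition.

    `pairs` is a list of (school_a, school_b) tuples. Returns a dict
    mapping each school to "treatment" or "comparison".
    """
    rng = random.Random(seed)  # local generator so a run can be reproduced
    assignment = {}
    for school_a, school_b in pairs:
        # Flip a fair coin to decide which member of the pair gets the unit.
        if rng.random() < 0.5:
            treated, control = school_a, school_b
        else:
            treated, control = school_b, school_a
        assignment[treated] = "treatment"
        assignment[control] = "comparison"
    return assignment


# Hypothetical labels standing in for five matched pairs of schools.
PAIRS = [("A1", "A2"), ("B1", "B2"), ("C1", "C2"), ("D1", "D2"), ("E1", "E2")]
ASSIGNMENT = assign_pairs(PAIRS, seed=2003)
```

Randomizing within pairs rather than across all ten schools keeps the two conditions balanced on whatever characteristics were used for matching.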

Students were given a Motion and Forces Assessment (MFA) at pretest and posttest to measure their conceptual understanding of the target concept (see Figure 1). The MFA is a curriculum-independent assessment composed of ten selected and constructed response items designed for ease of use in diverse classroom settings. It uses language and illustrations that allow it to be read and understood by a maximum number of 6th grade students. The MFA was administered both before and after instruction to provide pre and post measures of student understanding. Cronbach’s alpha for the ten items of the MFA used in calculating the weighted score indicated the internal consistency of the assessment (α = 0.52). The weighted score range is 0 to 100. A standard setting process was undertaken to determine cut scores that distinguish among four levels of understanding: flexible understanding (scores ranging from 70 to 100), some fluency with ideas (50-70), context-limited understanding (20-50), and no understanding (0-20).

In addition, a student questionnaire to measure student learning engagement (basic and advanced) and motivation (mastery goal orientation and performance goal orientation) was adapted from Marks (2000) and from Midgely et al. (2000). For more information on the motivation and engagement instrument, scales, and variables, see Lynch, Kuipers, Pyke & Szesze (2005). The questionnaire was administered before and after instruction, with Cronbach’s alpha on all scales ranging from 0.69 to 0.83. With minor changes, both the MFA and the questionnaire instrument were consistent across the two years of implementation studies.

To answer the first two research questions, data in Years 2 and 3 from treatment and comparison groups were analyzed using Analysis of Covariance (ANCOVA), with pretest measures used as covariates. To answer Research Question 4, regression analyses were conducted using data from the treatment group in both years. All analyses were conducted using SPSS for Windows, Version 12.0.

Results

This paper presents overall and disaggregated student outcome data on achievement, motivation, and engagement from the Year 2 (2003-2004) implementation of Motion and Forces in ten schools (N = 2172) and data from the Year 3 (2004-2005) implementation in the same ten schools (N = 2498). Answering Research Questions 3 and 4 requires analysis of data across both years of the study.

Research Question 1.

Year 2 Results

Analysis of the Year 2 experimental data for MFA shows higher mean scores overall at posttest (M = 55.87) than at pretest (M = 41.54). MFA scores are generally higher when the experimental curriculum is used than when the comparison curriculum is used. Overall, the Treatment group outperformed the Comparison group on the MFA outcome measure, F(1, 2169) = 6.44, p < .05, ES = .10. While the mean score for the Treatment condition was significantly higher than the mean score for the Comparison condition, when levels of understanding are considered, both mean scores fall within the same level of understanding as determined by the standard setting process (i.e., some fluency with concepts, 50-70) (Figure 2). There were no overall differences between Treatment and Comparison on motivation or engagement scores. In general, though, relatively high mastery goal orientation and learning engagement but low performance orientation were observed in the study sample (see Table 1 for descriptive data on all variables in Year 2).
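To make the level comparison concrete, the sketch below maps a weighted MFA score onto the four levels of understanding set by the cut scores. The published ranges share their end points (for example, 50-70 and 70-100), so the sketch assumes that a score exactly on a cut point belongs to the higher level; the function name `understanding_level` is illustrative only.

```python
def understanding_level(score):
    """Map a weighted MFA score (0 to 100) to a level of understanding.

    Assumption: a score exactly on a cut point (20, 50, 70) is placed
    in the higher level, since the source ranges overlap at those points.
    """
    if not 0 <= score <= 100:
        raise ValueError("weighted MFA scores range from 0 to 100")
    if score >= 70:
        return "flexible understanding"
    if score >= 50:
        return "some fluency with ideas"
    if score >= 20:
        return "context-limited understanding"
    return "no understanding"


# Overall Year 2 means reported in the text.
print(understanding_level(41.54))  # pretest mean
print(understanding_level(55.87))  # posttest mean
```

Under this assumption the overall Year 2 pretest mean falls in the context-limited range and the posttest mean falls in the some-fluency range, which is the same band in which both curriculum conditions ended.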

Year 3 Results

Analysis of the Year 3 experimental data for MFA yielded somewhat different results. As in Year 2, scores were higher at posttest (M = 51.87) than at pretest (M = 38.06), but both the pretest and posttest means were lower than in Year 2. Again, mean posttest scores for both curriculum conditions were within the same level of understanding (some fluency with concepts), but in Year 3 ANCOVA indicated no statistically significant effect for curriculum condition, F(1, 2251) = 2.546, p > .05. For a second time, relatively high mastery goal orientation and learning engagement but low performance orientation were observed in the study sample. Basic learning engagement mean scores were higher in the treatment condition than in the comparison condition, but no other overall differences between Treatment and Comparison were found on these scales (see Table 1 for descriptive data on all variables in Year 3).

Research Question 2.

Disaggregated data show patterns across subgroups that may not be revealed in the overall data. Subgroups of interest include Gender, Ethnicity, Free and Reduced Meals Status (FARMS), English for Speakers of Other Languages (ESOL), and Special Education Status (SPED). Results are presented first for Year 2 and then for Year 3.

Year 2 Results

When data were disaggregated, ANCOVA for MFA indicated no statistically significant interaction between curriculum condition and gender. However, there was a significant main effect for gender, F(1, 2167) = 8.357, p < .05. These results suggest that the male subgroup had significantly higher MFA means than the female subgroup.

ANCOVA indicated a statistically significant interaction for MFA outcomes between curriculum condition and ethnicity, F(3, 2163) = 3.907, p < .05. Follow-up tests were conducted to determine simple main effects for the interaction. These tests revealed that: a) White subgroup MFA mean scores were significantly higher in the treatment condition than in the comparison condition; b) in the treatment condition, Asian American and White subgroup means were significantly higher than the Hispanic and African American subgroup means; c) in the comparison condition, the Asian American and White subgroup means were significantly higher than the Hispanic and African American subgroup means (see Figure 3).

ANCOVA indicated a statistically significant interaction between curriculum condition and FARMS, F(3, 2165) = 8.094, p < .05. Follow-up tests were conducted to determine simple main effects for the interaction. These tests revealed that: a) the Never FARMS subgroup mean was significantly higher in the treatment condition than in the comparison condition; b) in the treatment condition, the Never FARMS subgroup mean was significantly higher than the Prior FARMS and Now FARMS subgroup means; c) in the comparison condition, the Never FARMS subgroup mean was significantly higher than the Prior FARMS and Now FARMS subgroup means (see Figure 4).

ANCOVA indicated no statistically significant interaction between curriculum condition and ESOL, but there was a statistically significant main effect for ESOL. Follow-up tests indicated that the Never ESOL subgroup mean was significantly higher than the Prior ESOL and Now ESOL subgroup means across both conditions.

ANCOVA indicated a statistically significant interaction between curriculum condition and SPED, F(1, 2167) = 5.66, p < .05. Follow-up tests revealed that: a) the Never SPED subgroup mean was significantly higher in the treatment condition than in the comparison condition; b) in the treatment condition, the Never SPED subgroup mean was significantly higher than the Now SPED subgroup mean; c) in the comparison condition, there were no significant differences between subgroups. Effect sizes for disaggregated subgroups are presented in Table 2. In addition, Figures 3 and 4 show the adjusted posttest means for the Ethnicity and FARMS subgroups by curriculum condition.

Data for motivation and engagement were also disaggregated. Females in the treatment condition scored significantly higher on performance orientation than females in the comparison condition. Basic learning engagement was higher for the Never FARMS and Now FARMS subgroups in the treatment condition than in the comparison condition, but the Never FARMS subgroup showed significantly higher basic learning engagement than the Now FARMS subgroup in the treatment condition.

Year 3 Results

When data were disaggregated, ANCOVA indicated a statistically significant interaction between curriculum condition and gender, F(1, 2249) = 5.084, p < .05. Follow-up tests revealed that: a) the male subgroup mean was significantly higher in the comparison condition than in the treatment condition; b) in the treatment condition, there were no significant differences between male and female subgroups; c) in the comparison condition, the male subgroup mean was significantly higher than the female subgroup mean.

ANCOVA indicated a statistically significant interaction between curriculum condition and ethnicity, F(3, 2245) = 9.021, p < .05 (see Figure 5). Follow-up tests revealed that: a) the Asian American subgroup mean was significantly higher in the treatment condition than in the comparison condition; the African American and Hispanic subgroup means were significantly lower in the treatment condition than in the comparison condition; b) in the treatment condition, the White and Asian American subgroup means were significantly higher than the African American and Hispanic subgroup means; c) in the comparison condition, the White subgroup mean was significantly higher than all other subgroup means, and the African American subgroup mean was higher than the Hispanic subgroup mean.

ANCOVA indicated a statistically significant interaction between curriculum condition and FARMS, F(2, 2247) = 10.313, p < .05 (see Figure 6). Follow-up tests revealed that: a) the Now FARMS subgroup mean was significantly lower in the treatment condition than in the comparison condition; b) in the treatment condition, the Never FARMS subgroup mean was significantly higher than the Prior FARMS and Now FARMS subgroup means; the Prior FARMS subgroup mean was significantly higher than the Now FARMS subgroup mean; c) in the comparison condition, the Now FARMS subgroup mean was significantly lower than the Never FARMS and Prior FARMS subgroup means.

ANCOVA indicated no statistically significant interaction between curriculum condition and ESOL, which suggests that the magnitude of the effect of the curriculum unit was similar for each ESOL subgroup. However, there was a statistically significant main effect for ESOL, F(2, 2247) = 17.754, p < .05. Follow-up tests indicated that the Never ESOL subgroup mean was significantly higher than the Prior ESOL and Now ESOL subgroup means and that the Prior ESOL subgroup mean was significantly higher than the Now ESOL subgroup mean.

ANCOVA indicated no statistically significant interaction between curriculum condition and SPED, which suggests that the magnitude of the effect of the curriculum unit was similar for each SPED subgroup. However, there was a statistically significant main effect for SPED, F(1, 2249) = 33.607, p < .05. The subgroup means indicated that the No SPED subgroup scored significantly higher than the Now SPED subgroup on the MFA. The Now SPED subgroup displayed a lower level of understanding of target concepts than the No SPED subgroup. Differences between subgroup means appear to be about the same in both experimental conditions. For the Year 3 data presented above, effect sizes on MFA scores for all disaggregated subgroups are shown in Table 4.

Year 3 disaggregated data on motivation and engagement outcomes revealed that: a) in the aggregate, basic learning engagement mean scores are higher in the treatment condition than in the comparison condition; b) the Now FARMS subgroup performance approach mean is significantly higher in the treatment condition than in the comparison condition; c) the Now FARMS subgroup advanced learning engagement mean is significantly higher in the treatment condition than in the comparison condition; and d) the Never FARMS subgroup advanced learning engagement mean is significantly lower in the treatment condition than in the comparison condition.

Research Question 3

From the data collected in Years 2 and 3, it appears that the effectiveness of the curriculum unit does not increase as teachers and schools become more experienced with the unit. First, lower overall posttest scores were observed in the treatment condition in Year 3 than in Year 2. In Year 2, treatment group performance on the posttest improved 15.65 points over the pretest score, while in Year 3 treatment group performance improved only 12.98 points from pretest to posttest (see Table 1). This is further supported by effect size data from both years (Table 2). The overall effect size declined from .10 in Year 2 to -.06 in Year 3. In addition, patterns of effect sizes for subgroups suggest that the effectiveness of the curriculum unit for particular subgroups does not increase with experience. For example, the negative effect sizes observed for African American and Hispanic subgroups in Year 2 persist in Year 3 and in fact become more negative. The same is true of the Year 2 and Year 3 effect sizes for the Prior and Now FARMS subgroups.

Research Question 4

In Year 2 it appears that some subgroups with among the highest levels of mastery goal orientation are among the lowest scoring on the MFA, and that relatively high levels of basic learning engagement are related to better performance on the MFA. In Year 3 as well, subgroups with the highest levels of basic learning engagement appear to have the highest MFA mean scores.

To further address this question, regression analyses were conducted using data from the treatment condition only (Motion and Forces). For Year 2, results of stepwise regression analysis with all posttest measures of goal orientation, engagement, and MFA performance indicate that basic learning engagement and mastery goal orientation are predictors of concept learning (MFA). Results of the F test of the R square change are statistically significant, F(1, 1034) = 8.23, p = .004, with both predictor variables in the model. Basic learning engagement is a significant positive predictor of MFA outcomes, while mastery goal orientation is a negative predictor. Similar regression analysis for Year 3 revealed that basic learning engagement and advanced learning engagement are predictors of concept learning (MFA). Results of the F test of the R square change are statistically significant, F(1, 904) = 13.95, p < .001, with basic learning engagement as a positive predictor of MFA outcomes and advanced learning engagement as a negative predictor of MFA outcomes.
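The F test of the R square change reported above compares a regression model with and without the added predictor. The sketch below uses the standard formula for that statistic. The R square values in the example are hypothetical placeholders (the paper does not report them), and the sample size of 1037 is only inferred from the Year 2 denominator degrees of freedom under the assumption of two predictors in the model.

```python
def f_change(r2_reduced, r2_full, n, k_full, q=1):
    """F statistic for the change in R square when q predictors are added.

    r2_reduced: R square of the model before the predictors are added
    r2_full:    R square of the model after they are added
    n:          number of cases
    k_full:     number of predictors in the full model
    Returns (F, df_numerator, df_denominator).
    """
    if not 0 <= r2_reduced <= r2_full < 1:
        raise ValueError("need 0 <= r2_reduced <= r2_full < 1")
    df_num = q
    df_den = n - k_full - 1
    if df_den <= 0:
        raise ValueError("not enough cases for this many predictors")
    # Gain in explained variance per added predictor, divided by the
    # unexplained variance per residual degree of freedom.
    f_stat = ((r2_full - r2_reduced) / df_num) / ((1 - r2_full) / df_den)
    return f_stat, df_num, df_den


# Hypothetical R square values; n = 1037 with two predictors yields the
# degrees of freedom (1, 1034) that the text reports for Year 2.
F_STAT, DF_NUM, DF_DEN = f_change(r2_reduced=0.040, r2_full=0.048, n=1037, k_full=2)
```

With samples of this size, even a small increase in R square can produce a statistically significant F, so a significant change should be read alongside the size of the increase itself.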

Discussion and Conclusions

The experimental results from the Year 2 implementation of Motion and Forces suggest that students given the opportunity for concept learning with highly rated curriculum materials outperform their peers who learn with existing curriculum materials. However, the difference between these groups has little practical significance. Disaggregated data reveal main effects, interaction effects, and greater effect sizes for certain subgroups in the treatment condition (White, Asian American, Never FARMS). The subgroups of greatest concern (Hispanic, African American, Now FARMS) did not gain as much as their peers in the comparison condition when learning with the Motion and Forces curriculum materials. In Year 3, however, there is no statistical or practical significance overall between the treatment and comparison groups on the concept learning measure. Year 3 effect sizes show that the Motion and Forces curriculum is effective for the Female, Asian American, White, and Never FARMS subgroups. On the other hand, in Year 3 Motion and Forces appears to have the opposite of the expected effect for the Male, African American, Hispanic, and Prior and Now FARMS subgroups when their MFA scores are considered relative to their peers in the comparison condition.

In summary, outcome data from consecutive quasi-experimental studies with Motion and Forces provide some evidence that the materials may be exacerbating achievement gaps in the study sample. These kinds of data patterns have not been observed with Motion and Forces before. In fact, very little evidence exists at all to address the effectiveness of science curriculum materials across different subgroups of students in a comparative way. This is due to the lack of scientifically-based evidence being collected in the development, piloting, or research phases of such curriculum materials. Design studies, or those lacking a matched comparison group, would not have uncovered the same concerning patterns seen here. Such studies typically examine pre-post measures of student learning in the treatment group to conclude that, overall, students gain in their conceptual understanding. It is clear that if progress is to be made in closing achievement gaps in science (and math), researchers must continue to closely examine the ways in which new curriculum materials (and other interventions) interact with various subgroups of students to produce patterns in student outcome data.

Furthermore, given the need for science education reform that is explicitly aimed at reducing the achievement gap, the patterns revealed in the disaggregated data for Motion and Forces raise significant questions about the effectiveness of the unit for all groups of students. From an equity perspective, these data reveal important outcome patterns that are worth further exploration to develop a better understanding of how all groups of students learn with these materials. Perhaps ethnographic data that explore the enactment of Motion and Forces in diverse classrooms could offer important insights into how some groups of students are better able to learn the target ideas than other groups of students. It is important to know if and how the intervention is helpful to some students; this may affect considerations about the adoption of curriculum materials and revisions of curriculum rating systems, and may suggest better methods to examine more closely how and why this unit works for some groups but not for others.

Turning to the experience effect of the curriculum materials, Year 3 posttest means for student learning were not as high as they were in Year 2. These results are inconsistent with what other researchers have found regarding the effect of experience with science curriculum materials on student outcomes (Fogleman & McNeill, 2005).
Given our understanding of teaching and learning in science classrooms, it seemed reasonable to expect that, as teachers became more familiar or comfortable with a set of curriculum materials through experience, a noticeable gain in student learning across the years of the study might be observed. This was not the case. Perhaps, with a unit such as Motion and Forces, much more time is needed in the form of subsequent implementations before teachers are able to realize the benefits of experience. For example, is it more reasonable to expect a meaningful gain in student learning as teachers move from their fifth year of implementation to their sixth year as opposed to moving from their first to their second implementation? Because of the logistical and procedural demands of getting through the Motion and Forces explorations, are teachers in early years (1-2 years) unable to focus as much on student learning as they could in subsequent years? These kinds of questions remain unanswered regarding Motion and Forces and in the science education research community in general.

In terms of the motivation and engagement outcomes, it is a bit surprising, given the dependence of achievement goals on classroom context, that goal orientation outcomes do not show any difference across the treatment and comparison groups. It appears that subgroups that report the highest levels of mastery goal orientation are those that are the lowest scoring on the MFA. This is inconsistent with previous research suggesting a stronger connection between these kinds of achievement goals and learning (Ames & Ames, 1984; Pintrich & Schunk, 1996). Perhaps research exploring motivation and engagement patterns in groups of students that are deemed “ready” for conceptual change might be a reasonable next step. One trend that was consistent across both Years 2 and 3 was the utility of basic engagement in predicting concept learning. Prior research (cf. Pintrich, Marx, & Boyle, 1993) has suggested that advanced learning engagement and mastery goal orientation are often more closely related to learning outcomes than was observed here.
Perhaps, with materials like Motion and Forces that require students to assemble lab equipment, conduct numerous trials and experiments, collect large amounts of data, and respond to many questions in writing, it is sufficient for students simply to remain engaged at a more basic level. More analyses are perhaps needed to begin to unpack the complicated relationships between motivation, engagement, and learning in middle school science classrooms using curriculum materials such as Motion and Forces.

While the Year 2 data show some positive results for the effectiveness of the Motion and Forces curriculum materials, it is interesting that the evidence for overall effectiveness was not present in both years of the study. These results have led SCALE-uP to yet a third implementation of Motion and Forces. The third implementation (Year 4) will be similar to Year 2 in that groups of schools that have had no prior experience with the intervention have been selected to implement the materials. That is, schools will be considered first year implementers in Year 4 much in the same way schools were considered first year implementers in Year 2 of this study. It is our hope that findings from these replication studies can serve the public interest by further contributing to the accumulating knowledge base on the effectiveness of these curriculum materials. In many ways, this third implementation represents an additional replication of our study given that the same methods and designs are being used with additional research participants (i.e., teachers and schools). We hope that an additional and “tighter” implementation study can help us decide if and how Motion and Forces should be scaled up to more students.

References

Adelman, H.S. & Taylor, L. (1997). Toward a scale-up model for replicating new approaches to schooling. Journal of Educational and Psychological Consultation, 8(2), 197-230.
American Association for the Advancement of Science (AAAS). (1993). Benchmarks for science literacy. New York: Oxford University Press.
AAAS. (2003). Project 2061 analysis of science and mathematics assessment tasks. Manuscript in preparation.
Ames, C., & Ames, R. (1984). Goal structures and motivation. The Elementary School Journal, 85(1), 39-52.
Ball, D.L. & Cohen, D.K. (1996). Reform by the book: What is – or might be – the role of curriculum materials in teacher learning and instructional reform? Educational Researcher, 25, 6-8, 14.
Bradach, J. (2003). Going to scale: The challenge of replicating social programs. Stanford Social Innovation Review, 1, 18-25.
Champagne, A., Gunstone, R., & Klopfer, L. (1985). Effecting changes in cognitive structures among physics students. In L. West & A. Pines (Eds.), Cognitive structure and conceptual change (pp. 61-90). Orlando, FL: Academic Press.
Chi and Roscoe (2002).
Chatterji, M. (2004). Evidence on “what works”: An argument for extended-term mixed-method (ETMM) evaluation designs. Educational Researcher, 33(9), 3-13.
Fogleman, J. & McNeill, K.L. (2005). Comparing teachers’ adaptations of an inquiry-oriented curriculum unit with student learning. Paper presented at the Annual Meeting of the American Educational Research Association, April, Montreal, Canada.
Geier, R.R. (2005). Student achievement outcomes in a scaling urban standards-based science reform. Unpublished doctoral dissertation, University of Michigan, MI.
Goodlad, J. (1984). A place called school: Prospects for the future. New York: McGraw-Hill.
Gunstone, R. & Watts, D. M. (1985). Force and motion. In R. Driver, E. Guesne & A. Tiberghien (Eds.), Children’s ideas in science (pp. 85-104). Philadelphia: Open University Press.
Harvard-Smithsonian Center for Astrophysics. (2001). ARIES: Exploring Motion and Forces: Speed, Acceleration, and Friction. Watertown, MA: Charlesbridge Publishing.
Hewson, P. (1981). A conceptual change approach to learning science. European Journal of Science Education, 3, 383-396.
Hewson, P.W., & Hewson, M.G. (1984). The role of conceptual conflict in conceptual change and the design of science instruction. Instructional Science, 13, 1-13.
Hubbard, B.M., Giese, M.L., & Rainey, J. (1998). A replication study of Reducing the Risk, a theory-based sexuality curriculum for adolescents. The Journal of School Health, 68(6), 243-247.
Kesidou, S., & Roseman, J.E. (2002). How well do middle school science programs measure up? Findings from Project 2061’s curriculum review. Journal of Research in Science Teaching, 39(6), 522-549.
Linnenbrink, E. A. & Pintrich, P. R. (2003). Achievement goals and intentional conceptual change. In G. M. Sinatra & P. R. Pintrich (Eds.), Intentional conceptual change (pp. 347-374). Mahwah, NJ: Lawrence Erlbaum Associates.
Marks, H. (2000). Student engagement in instructional activity: Patterns in the elementary and middle school years. American Educational Research Journal, 37(1), 153-184.
Midgley, C., Maehr, M., & Hruda, L. (2000). Manual for the Patterns of Adaptive Learning Scales. Ann Arbor, MI: University of Michigan.
Minstrell, J. (1989). Teaching science for understanding. In L. Resnick & L. Klopfer (Eds.), Toward the thinking curriculum: Current cognitive research (pp. 129-149). Alexandria, VA: Association for Supervision and Curriculum Development.
Mosteller, F., & Boruch, R. (Eds.). (2002). Evidence matters: Randomized trials in education research. Washington, DC: The Brookings Institute.
National Research Council. (2005). Advancing scientific research in education. Committee on Research in Education. Lisa Towne, Lauress L. Wise, and Tina M. Winters. Center for Education, Division of Behavioral and Social Sciences and Education. Washington, DC: The National Academies Press.
Ochsendorf, R. (2003). Using the Project 2061 Curriculum Analysis to rate a middle school science curriculum unit: ARIES: Exploring Forces and Motion. George Washington University/Montgomery County Public Schools SCALE-uP.
Author (manuscript in preparation). Evaluating a science curriculum unit: Learning through the process. Manuscript in preparation.
Pintrich, P. R., Marx, R. W., & Boyle, R.B. (1993). Beyond cold conceptual change: The role of motivational beliefs and classroom contextual factors in the process of conceptual change. Review of Educational Research, 63, 167-199.
Pintrich, P.R., & Schunk, D.H. (1996). Motivation in education: Theory, research, and applications. Englewood Cliffs, NJ: Prentice Hall, Inc.
Posner, G. J., Strike, K. A., Hewson, P. W., & Gertzog, W. A. (1982). Accommodation of a scientific conception: Toward a theory of conceptual change. Science Education, 66, 211-217.
Roseman, J., Kesidou, S., & Stern, L. (1996). Identifying curriculum materials for science literacy: A Project 2061 evaluation tool. Paper prepared for the Colloquium “Using the National Science Education Standards to Guide the Evaluation, Selection, and Adaptation of Instructional Materials.” National Research Council, November 1996.
Senk, S.L. & Thompson, D.R. (2003). Standards based school mathematics curricula: What are they? What do students learn? Mahwah, NJ: Lawrence Erlbaum Associates.
Strike, K.A., & Posner, G. J. (1992). A revisionist theory of conceptual change. In R. A. Duschl and R. J. Hamilton (Eds.), Philosophy of science, cognitive psychology, and educational theory and practice (pp. 147-176). New York: State University of New York Press.


Figure 2. Adjusted mean scores for motions and forces understanding between the comparison and treatment conditions. Means adjusted by the pretest covariate, pretest mean = 41.52 (Year 2). Note: Levels of understanding: 20-50 = Some understanding of concepts; 51-70 = Some fluency with concepts; 71-100 = Flexible understanding of concepts.


Table 1. Mean scores for concept assessment, learning engagement, and goal orientation by curriculum condition in Years 2 and 3.

                                          Year 2 Comparison      Year 2 Treatment       Year 3 Comparison      Year 3 Treatment
Variable                          Test        M       SD     n       M       SD     n       M       SD     n       M       SD     n
MFA (Concept Assessment)          Pre     42.15  (20.51)  1100   40.93  (17.85)  1185   37.74  (17.59)  1204   38.37  (17.06)  1052
                                  Post    55.16  (22.64)  1040   56.58  (22.18)  1132   52.38  (21.68)  1203   51.35  (22.54)  1051
Learning Engagement: Basic        Pre      3.70   (0.78)   949    3.61   (0.75)  1113    4.10   (0.73)  1103    4.11   (0.69)   965
                                  Post     3.71   (0.81)   835    3.73   (0.78)  1056    4.10   (0.72)  1022    4.18   (0.68)   878
Learning Engagement: Advanced     Pre      3.17   (0.93)   949    3.04   (0.88)  1113    2.97   (0.88)  1095    2.96   (0.90)   954
                                  Post     3.14   (0.96)   832    3.04   (0.97)  1043    2.98   (0.96)  1008    2.94   (0.99)   865
Goal Orientation: Mastery         Pre      3.60   (0.90)   949    3.52   (0.90)  1112    3.62   (0.85)  1101    3.59   (0.82)   956
                                  Post     3.42   (0.93)   840    3.39   (0.99)  1048    3.48   (0.93)  1011    3.48   (0.95)   867
Goal Orientation: Perf. Approach  Pre      2.49   (1.02)   943    2.44   (0.96)  1113    2.04   (0.93)  1089    2.07   (0.87)   959
                                  Post     2.45   (1.05)   845    2.44   (0.99)  1057    2.05   (0.96)  1009    2.11   (0.96)   872

Note. The reported n is the number of students that completed all tests: pretest and posttest and in each subgroup. Individual scores are derived from an unweighted average of Likert items for each scale from the engagement and goal orientation questionnaire. 1 = Not true through 5 = Very true.
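The pretest-to-posttest gains cited under Research Question 3 can be rechecked from the MFA rows of Table 1. The following is a minimal sketch of that arithmetic, assuming the treatment-condition means are read from the table as entered below; the variable and function names are illustrative only and are not part of the original analysis.

```python
# MFA treatment-condition means (pretest, posttest) as read from Table 1.
# Assumption: these are unadjusted group means, so each gain is a difference
# of group means, not a mean of individual student gains (the pretest and
# posttest n differ slightly).
mfa_treatment_means = {
    "Year 2": (40.93, 56.58),
    "Year 3": (38.37, 51.35),
}


def pre_post_gain(pre_mean, post_mean):
    """Return the posttest mean minus the pretest mean, rounded to 2 decimals."""
    return round(post_mean - pre_mean, 2)


gains = {
    year: pre_post_gain(pre, post)
    for year, (pre, post) in mfa_treatment_means.items()
}

for year, gain in gains.items():
    print(f"{year}: gain of {gain:.2f} points")
```

Run as written, this reproduces the 15.65-point (Year 2) and 12.98-point (Year 3) treatment gains quoted in the text.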


Figure 3. Adjusted mean scores for motions and forces understanding by ethnicity and curriculum condition. Means adjusted by the pre-test covariate, pretest mean = 41.52 (Year 2).

Figure 4. Adjusted mean scores for motions and forces understanding by Free and Reduced-Price Meal Status and curriculum condition. Means adjusted by the pretest covariate, pre-test mean = 41.52 (Year 2).


[Plot: Adjusted Means for MFA Weighted Posttest Score, by Ethnicity (Asian American, African American, White, Hispanic). Vertical axis: Estimated Marginal Means, 40.00 to 75.00. Horizontal axis: Experimental Condition (Comparison, Treatment).]

Figure 5. Adjusted mean scores for motions and forces understanding by ethnicity and curriculum condition in Year 3. Means adjusted by the pre-test covariate, pre-test mean = 38.04.

[Plot: Adjusted Means for MFA Weighted Posttest Score, by Free and Reduced Meals Status (Never FARMS, Prior FARMS, Now FARMS). Vertical axis: Adjusted Means, 40.00 to 75.00. Horizontal axis: Experimental Condition (Comparison, Treatment).]

Figure 6. Adjusted mean scores for motions and forces understanding by Free and Reduced-Price Meal Status and curriculum condition in Year 3. Means adjusted by the pre-test covariate, pre-test mean = 38.04.

Table 2. Curriculum effect sizes for all levels of Independent Variables for MFA in Year 2 and Year 3.

                            Year 2           Year 3
Variable                     n       d        n       d
Overall                   2169    0.10     2365   -0.06
Gender
  Male                    1108    0.12     1165   -0.12
  Female                  1064    0.08     1191    0.07
Ethnicity
  African American         486   -0.04      470   -0.34
  Asian American           354    0.17      378    0.21
  Hispanic                 394   -0.10      455   -0.18
  White                    938    0.21     1053    0.03
FARMS
  Never                   1349    0.23     1492    0.09
  Prior                    248   -0.08      259   -0.20
  Now                      575   -0.10      605   -0.21
ESOL
  Never                   1717    0.12     1836   -0.02
  Prior                    309   -0.16      369   -0.02
  Now                      146    0.26      151   -0.15
SPED
  No                      1977    0.13     2070   -0.03
  Now                      195   -0.20      286   -0.02

Note. d = Cohen’s d effect size.
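As an illustration of the statistic named in the note, the sketch below computes Cohen's d as a treatment-minus-comparison mean difference divided by a pooled standard deviation. The paper does not state which variant of d was used or whether it was based on covariate-adjusted means, so the pooled-SD form, the sign convention (positive values favor the treatment condition), and the function name are assumptions made here. The example inputs are the unadjusted Year 2 MFA posttest descriptives as read from Table 1, so the result should not be expected to reproduce the values in Table 2.

```python
import math


def cohens_d(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Cohen's d (treatment minus comparison) using a pooled standard deviation.

    Assumption: this pooled-SD form is a common textbook definition; it is
    not confirmed to be the formula behind Table 2.
    """
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    return (mean_t - mean_c) / math.sqrt(pooled_var)


# Unadjusted Year 2 MFA posttest descriptives as read from Table 1:
# treatment M = 56.58, SD = 22.18, n = 1132; comparison M = 55.16, SD = 22.64, n = 1040.
d_year2_raw = cohens_d(56.58, 22.18, 1132, 55.16, 22.64, 1040)
print(f"Unadjusted Year 2 posttest d = {d_year2_raw:.2f}")
```

Any gap between this unadjusted value and the tabled Year 2 effect size would be consistent with the tabled values having been derived from pretest-adjusted means, although the paper does not confirm this.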