J Behav Educ (2010) 19:222–238 DOI 10.1007/s10864-010-9108-3 ORIGINAL PAPER

Effects of Numbered Heads Together on the Daily Quiz Scores and On-Task Behavior of Students with Disabilities

Todd Haydon • Lawrence Maheady • William Hunter

Published online: 14 August 2010 © Springer Science+Business Media, LLC 2010

Abstract Previous research has demonstrated that Numbered Heads Together, a cooperative learning strategy, is more effective than traditional teacher-led instruction in academic areas such as social studies and science. The current study compared the effects of two types of Numbered Heads Together strategies with a baseline condition during 7th grade language arts lessons. Results indicated that three students with various disabilities had higher percent intervals of on-task behavior and daily quiz scores during either Heads Together condition. Teacher satisfaction ratings suggested that Heads Together was easy to implement, and all three students preferred this strategy to baseline instruction. A discussion of study limitations, implications, and future research directions is included.

Keywords Numbered Heads Together • Peer-assisted teaching • Cooperative learning • Teacher questions • Language arts

T. Haydon (✉) • W. Hunter
CECH University of Cincinnati, Teachers/Dyer Hall, Cincinnati, OH 0022, USA

L. Maheady
SUNY Fredonia, E268 Thompson Hall, Fredonia, NY 14063, USA

Students with mild to moderate learning and behavior challenges do not fare as well in school as their typical achieving peers (Lerner and Johns 2009). Students with emotional/behavioral disorders (EBD), for example, tend to have poor outcomes such as high rates of absenteeism, low grade point averages, low graduation rates (41.9%), course failure, and undesirable levels of school dropout (US Department of Education 2001). Researchers have shown that these students perform 1.2–2 grade levels behind their typical achieving peers in elementary school and that the discrepancy widens to 3.5 years for those who make it to high school (Coutinho 1986; Ryan et al. 2004). Similarly, students with attention-deficit/hyperactivity disorder (ADHD) have academic deficiencies and are at risk for academic failure (Gunter and Denny 1998; Trout et al. 2007), whereas those with mild intellectual disability are characterized as having slow learning rates, delayed cognitive development, and difficulty identifying significant task features. Students with mild intellectual disability also tend to be passive learners, who display slow acquisition rates for new knowledge and skills (Tucker et al. 1998). As a result, some general education teachers may feel ill-prepared to meet the instructional needs of these students (Mortweet et al. 1999).

Previous research indicates that researchers, teachers, and administrators believe that hyperactive, inappropriate, and disruptive behaviors of students with varied disabilities must be controlled before they can acquire and master academic skills (Ryan et al. 2004; Trout et al. 2007). As such, much research, particularly involving students with EBD, has focused on decreasing inappropriate behaviors with little attention given to meeting their educational needs (Wehby et al. 2003). However, since the recent academic requirements mandated by the No Child Left Behind Act (2002), researchers have started to concentrate on identifying effective instructional strategies that may improve the academic performance of students with various disabilities (Mooney et al. 2003; Wanzek and Vaughn 2009).
One such class of educational interventions is peer-mediated instruction (e.g., peer tutoring, peer-assisted learning strategies [PALS], and cooperative learning), in which peers, rather than adults, mediate the instructional process by presenting educational content, providing immediate positive and corrective feedback, and monitoring one another’s performance (Maheady et al. 2006; Slavin 1995). In a literature review, Ryan et al. (2004) reported that peer-mediated instruction produced large effect sizes (greater than 0.8), supporting its use in improving academic outcomes for students with EBD. The authors noted further that such positive changes occurred across all content areas, including reading, English, math, science, and history. In addition, peer-mediated strategies received consistently positive consumer satisfaction ratings from teachers and pupils.

Cooperative learning is one type of peer-mediated instruction that involves small, heterogeneous groups of students working together in a non-competitive manner to maximize their own and others’ learning (Putnam 1998). Five components are essential to make cooperative learning groups successful (Johnson and Johnson 1999; Johnson et al. 1991; Putnam 1998):

• Positive interdependence: all group members are concerned about each other’s performance toward achieving a group goal.
• Individual accountability: each student is responsible for learning academic content and contributing to the group.
• Face-to-face positive interaction: while working, students directly interact with one another.
• Social skills: skills will vary according to age but commonly involve enhancing cooperation.
• Group processing: students evaluate whether group goals were achieved and whether there was an equal opportunity for responding.

Cooperative learning strategies are effective because they increase pupil response opportunities, provide more immediate and frequent feedback, increase the number of complete learning trials, and offer students the opportunities to serve as teachers and learners. In their teacher roles, for example, students must elaborate and explain academic material to peers and provide feedback on their performance. As learners, they are held accountable for their contributions, they must verbalize what they have learned, they are given numerous opportunities to actively participate (i.e., answering and asking questions), and they hear a variety of peer explanations for solving problems. Cooperative learning strategies often improve student understanding and retention of learned content. Moreover, as students collaboratively solve problems, there is an increased likelihood of peer acceptance (Emmer et al. 1979; Slavin 1995; Sutherland et al. 2000).

Although the potential benefits of cooperative learning groups have been documented in the research literature, these groups are uncommon in typical classroom settings. Rather, the most common mode of classroom instruction continues to be teacher-led instruction (i.e., lecturing) and calling on students to volunteer responses (Hayling et al. 2008; Kagan 1992). One concern about using traditional Whole Group Question and Answer (WGQ&A) arrangements is that most students become passive observers while a few, usually higher-achieving, pupils volunteer to answer teacher questions (Maheady et al. 1991, 2002). This is problematic because teachers often use pupil responses to determine what to do next instructionally (e.g., re-teach and/or introduce new content) (Barbetta and Heward 1993; Barbetta et al. 1993). If these decisions are made primarily on the basis of higher-performing students’ responses, then lower achieving peers may fall further behind instructionally.
One method that may increase class participation among all students is Numbered Heads Together (NHT). NHT is a cooperative learning strategy in which teachers (a) assign students to small (4-member), heterogeneous learning groups, (b) ask them to number themselves from 1 to 4, (c) direct questions to the entire class, and (d) tell students to put their heads together, come up with the best answers they can, and make sure that everyone on the team knows the answers (Kagan 1992). Groups are then given 20–30 s to discuss questions and to make sure that all team members understand the answers. Teachers then randomly select a number (i.e., from 1 to 4) and ask those “numbered” students to raise their hands to respond. Teachers pick one student, ask other similarly numbered students whether they agree or disagree with the answer, and then provide feedback to the entire class on their response accuracy. Researchers have shown that NHT can increase the number and variety of students who respond to teacher-led questions and can improve their performance on curriculum-based academic quizzes over the baseline condition (Maheady et al. 1991, 2002). When teachers use NHT, lower performing students participate actively in class, and their higher-achieving peers continue to discuss academic content (Kagan 1992; Maheady et al. 2002). As such, Numbered Heads Together plus the
incentives condition (NHT + I) may provide an effective and appealing instructional alternative to traditional, hand-raising responses to teacher-led questions.

In the first known NHT study, Maheady et al. (1991) utilized an alternating treatments design and compared NHT (with added incentive components) with a WGQ&A strategy. The authors found that using NHT, 3rd graders’ class averages on weekly social studies quizzes increased by approximately two letter grades (i.e., WGQ&A, M = 68.5% vs. NHT, M = 84.3%). In a follow-up investigation, Maheady et al. (2002) compared the effects of NHT and response cards (RC) with WGQ&A instruction. Results indicated that the science quiz scores of 21 6th grade students improved by about one letter grade under NHT and RC conditions relative to WGQ&A conditions (i.e., WGQ&A, M = 73.2; NHT, M = 81.6; RC, M = 81.5). Unlike the initial investigation, however, this study did not include a behavioral incentive package (i.e., public posting of quiz scores, between-team competition, and contingent rewards) as part of the NHT intervention.

In a third related study, Maheady et al. (2006) examined the role that the behavioral incentive package played in pupil performance while using NHT. The effects of NHT with and without an incentive package were compared in a culturally and linguistically diverse 6th grade science class that included 23 students. Pupil performance was compared on both formative (i.e., daily, 10-item chemistry quizzes) and summative (pre- and posttests) achievement measures, and consumer satisfaction ratings were derived for baseline and both intervention conditions. Results indicated a noticeable difference in mean percentage of correct responses on daily quiz scores in favor of NHT over the baseline condition (WGQ&A). Further increases were noted when NHT was used with the behavioral incentive package (i.e., WGQ&A, M = 72.4 vs. NHT, M = 80.3, vs. NHT + I, M = 89.2), and students indicated a preference for the NHT plus incentives condition.

The present study replicates the Maheady et al. (2006) study and extends the previous literature by (a) using a different setting, a 7th grade self-contained special education classroom; (b) working with students identified with various disabilities; (c) further investigating the use of incentives with and without NHT; and (d) using a new content domain, language arts instruction. The purpose of this study, therefore, was to examine the effects of NHT with and without an incentive package (NHT + I) on the on-task behavior and daily quiz scores of students with special needs during a language arts activity in a middle school self-contained resource room. From a social validity standpoint, one question was asked: which strategy did the teacher prefer?

Method

Participants and Setting

Teacher

The classroom teacher was an African American male, age 33, with 10 years of teaching experience. He held state certification in mild/moderate learning disabilities (K-12). During the study, there was a Caucasian female paraeducator, approximately 50 years old, in the room.

Students

Three target students (names are pseudonyms) participated in the study. Informed consent was sought for all eight students in the class, and the first three students to return signed forms were included in the study (the other students did not return the forms). Table 1 reports information on each participant’s gender, ethnicity, age, and IQ score. The three students’ disabilities were identified by school personnel using state-defined criteria (Kentucky Department of Education 2008), and their IQ scores were determined using the Kaufman Brief Intelligence Test (K-BIT-2). Carmen was identified with mild intellectual disability; her intelligence scores were two SDs below the mean, and she had deficits in her overall academic performance including acquisition, retention, and application of content-related knowledge, and impaired adaptive behavior. Nate was classified with EBD; he had deficits in social behavior, which impeded his ability to maintain satisfactory interpersonal relationships with adults or peers. He was working at a 5th grade skill level in reading and had academic deficiencies particularly with fluency, word recognition, and comprehension. RJ was identified with Other Health Impairments (OHI) and had ADHD. He was easily distracted, often off topic and off-task during instruction, which affected his grades. He was working at a 2nd grade skill level in reading and also had academic deficiencies in reading with fluency, word recognition, and comprehension. Each student received additional individualized research-based reading interventions outside of the classroom for 1 h a day.

Table 1 Participant characteristics

Name     Gender/Age                 Ethnicity   Disability   IQ score
Carmen   Female/13 years 5 months   AA          MID          70
Nate     Male/13 years 7 months     AA          OHI          94
RJ       Male/13 years 3 months     AA          EBD          93

AA African American, W white, MID mild intellectual disability, OHI other health impairment, EBD emotional/behavioral disorder

Setting

The study was conducted in a 6th to 8th grade middle school located in a Midwestern city. Approximately 92% of the students received reduced-price lunches.
The racial/ethnic make-up of the school was approximately 60% Caucasian, 32% African American, 2% American Indian, and 6% unspecified. The study took place in a 7th grade, self-contained resource room containing eight students with varied disabilities including six students with EBD, one student with mild intellectual disability, and one student with OHI. The classroom was arranged with four rows of four individual seats and included a large dry erase board in the front of the room and a smart board on the right side
that was used for daily interactive lessons. The teacher used a rectangular desk in the back of the room to observe students during independent seatwork. The teacher and paraprofessional’s desks were located in the rear right and left corners of the classroom. There was also a student computer station (four computers) on the left side of the classroom, and both the teacher and paraprofessional were in the classroom during the study. The study took place during whole group teacher-led language arts instruction and at the same time each afternoon. The planning and lesson structure for all sessions was consistent throughout the study. The teacher used part of a 30-min planning period to develop three questions with the purpose of relating student experiences to lesson objectives and to develop 10-item quizzes.

Materials

The teacher selected reading passages from the book Document-based Questions for Reading Comprehension and Critical Thinking (Housel 2007) and ensured that lesson standards were aligned with the state of Kentucky Middle School Core Content Standards (i.e., formulate questions, identify key information, summarize answers, etc.). The book consisted of high-interest stories, and the purpose was to prepare students to read non-fiction text. The objective of the lessons was to understand literature and to provide practice opportunities toward acquisition of the language arts material. A language arts specialist in the middle school evaluated the materials to verify that the topic, lesson plans, and objectives were appropriate for a 6th grade skill level. No recommendations (i.e., additions, deletions, rewording, etc.) were made to the classroom teacher. A university professor with expertise in reading comprehension independently reviewed the passages and, using a Flesch–Kincaid scoring procedure, indicated that the reading passages were at an average of a 6.2 (range = 6.0–6.4) grade level.
To keep quiz difficulty equivalent across conditions, the teacher developed all questions, and every quiz consisted of five multiple-choice questions, three true-or-false questions, and two fill-in-the-blank short-answer questions. Topics covered included stories of (a) famous people (Mahatma Gandhi, Amelia Earhart, and Thomas Edison) and (b) well-known historical events (e.g., battles, Gettysburg; disasters, Hindenburg, Titanic).

Dependent Measures

On-Task Behavior

On-task behavior was defined in a similar manner to Nelson et al. (1996) and included eyes and/or marker on the required assignment; eyes on peers discussing the material; or eyes on the teacher when instructions, directions, and feedback were given. On-task behavior also included comments related to material being covered in class. Data on student on-task behavior were collected using a 10-s momentary time sampling recording system during 15-min observation sessions.
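As a minimal sketch of how a percentage of on-task intervals could be derived from momentary time sampling, the Python below assumes one on-task/off-task judgment at the end of each 10-s interval, which would give 90 samples in a 15-min session if every interval were scored. The function name, constants, and sample record are hypothetical illustrations and are not data or procedures taken from the study.

```python
# Sketch only: names and sample data are hypothetical, not taken from the study.
SESSION_SECONDS = 15 * 60   # 15-min observation session
INTERVAL_SECONDS = 10       # one momentary sample per 10-s interval
N_INTERVALS = SESSION_SECONDS // INTERVAL_SECONDS


def percent_on_task(samples):
    """Return the percentage of momentary samples scored as on-task.

    `samples` holds one boolean per interval: True if the student was
    on-task at the moment the interval ended, False otherwise.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    return 100.0 * sum(1 for s in samples if s) / len(samples)


# Hypothetical record: off-task at every tenth sample, on-task otherwise.
record = [(i % 10) != 9 for i in range(N_INTERVALS)]
print(N_INTERVALS, percent_on_task(record))
```

Because only the instant at the end of each interval is scored, the result estimates the proportion of time on-task rather than measuring its duration directly.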


Quiz Scores

The academic outcome was the percentage of correct responses on daily 10-item quizzes administered at the end of each language arts activity.

Data Collection and Reliability

To provide evidence that the measures of on-task behavior were reliable, five first-year doctoral students served as observers and were trained by the first author using procedures outlined by Kennedy (2005). Observers were provided with a three-page document that included operational definitions and examples and non-examples of each dependent variable. After reading the manual, observers participated in a 1-h training session. Here, they practiced coding all behaviors while watching video clips of teachers implementing NHT during large group instruction. Data collection began as soon as interrater reliability levels of at least 90% were achieved for three consecutive trials. Data collectors were aware of which condition was being observed (i.e., BL, NHT, or NHT + I), but they were unaware of the relative effectiveness of any condition on the dependent variables.

Reliability Checks on Dependent Measures

To ensure accuracy of quiz scores, the first author and the classroom teacher independently scored all 10-item quizzes. To control for observer drift, the primary observer met with the secondary observers on a weekly basis and/or repeated the training exercises once every 5 sessions (Cooper et al. 2007).

Interobserver Agreement

Agreement on on-task behavior was calculated using an interval agreement formula, dividing the total number of agreements by the total number of agreements and disagreements and multiplying by 100%. The unit of measurement was percentage of 10-s intervals. Agreement on quiz scores was calculated using the total agreement formula, summing the total number of correct responses recorded by each scorer, dividing the smaller total by the larger total, and multiplying by 100%. The unit of measurement was percentage correct. Interobserver agreement was calculated during 57.1% of observations during the study.
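The two agreement formulas described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the function names and the example records are hypothetical, and the interval records are assumed to be equal-length lists with one on-task/off-task judgment per interval from each observer.

```python
# Sketch only: function names and example values are hypothetical.
def interval_agreement(observer_a, observer_b):
    """Interval-by-interval agreement: agreements / (agreements + disagreements) x 100."""
    if not observer_a or len(observer_a) != len(observer_b):
        raise ValueError("records must be non-empty and of equal length")
    agreements = sum(1 for a, b in zip(observer_a, observer_b) if a == b)
    disagreements = len(observer_a) - agreements
    return 100.0 * agreements / (agreements + disagreements)


def total_agreement(total_a, total_b):
    """Total agreement: smaller total / larger total x 100."""
    larger = max(total_a, total_b)
    if larger == 0:
        return 100.0  # both scorers recorded zero, so they agree
    return 100.0 * min(total_a, total_b) / larger


# Hypothetical 10-interval records from two observers (True = on-task).
primary = [True] * 9 + [False]
secondary = [True] * 8 + [False] * 2
print(interval_agreement(primary, secondary))  # 9 of 10 intervals match

# Hypothetical quiz totals (items marked correct by each scorer).
print(total_agreement(7, 7))
```

The interval formula is sensitive to which intervals the observers disagree on, whereas the total formula compares only the summed counts, so two scorers can reach 100% total agreement while disagreeing on individual items.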
Mean interobserver agreement for on-task behavior across all study conditions was 94.1, 92.9, and 92.6% (range 89.6–100%) for Carmen, Nate, and RJ, respectively. Average interobserver agreement for quiz scores was 100%.

Experimental Design and Conditions

An alternating treatments design (Barlow and Hayes 1979) was used to compare the effectiveness of the three instructional strategies (BL, NHT, NHT + I). Visual inspection was used as an analytic procedure (Kazdin 1982). Differential treatment effects were determined mainly by noting distinct separation of data points as well as determining the mean difference among conditions. Data trends were used to
point out differences between NHT and NHT + I conditions. The measure of percentage of non-overlapping data points (PND) was also utilized (Campbell 2004). The lesson structure for all study conditions included the following: (a) initial review of classroom rules and previous content, (b) explicit statements of lesson goals, (c) teacher lecture (i.e., writing, demonstrating, and drawing diagrams on white board), and (d) teacher questioning to assess pupil understanding. According to the teacher’s instructional routine, students were allowed 2 min to review quiz questions and then approximately 15 min to read selected passages. Based on a randomized schedule, the teacher was instructed to implement either (a) BL, (b) NHT, or (c) NHT + I. An entire class session typically lasted 50 min; however, the NHT interventions were used only during the last 15 min of each intervention session. The study was conducted over a period of 22 days.

Baseline (Hand-raising)

Baseline conditions are described using guidelines provided by Lane et al. (2007). First, the teacher restated classroom rules and asked three questions to assess prior knowledge on current topics. Next, students silently read their reading passages. The teacher then asked 10 questions to the entire class, and students volunteered responses by raising their hands. The teacher then selected volunteers at random. The only change to the normal instructional routine was that 10-item quizzes were administered at the end of language arts lessons. Consistent with pupils’ individualized education programs (IEPs), quiz items were read aloud by the teacher, and students wrote their individual responses.

Numbered Heads Together (NHT)

During the standard NHT condition, the teacher read a script (provided by the first author) to students that covered expected behaviors during NHT.
Classroom rules included the following: (a) one person talks at a time, (b) respect everyone’s answers, (c) use “indoor voices” when talking, (d) remain quiet when teacher speaks, (e) majority vote counts as your answer, and (f) quietly return to your seats to take quizzes. The teacher also prompted students to chorally respond and repeat the condition (NHT) that was in effect at that time. NHT implementation procedures were similar to those of Maheady et al. (2006). Students were first assigned to small, heterogeneous groups according to their current academic ranks in class (see, for example, Slavin 1995). Two heterogeneous teams were formed by assigning high, average, and low performers to each team. Students within teams were then assigned numbers 1–4 to designate who would respond to questions on each team. Team membership remained the same throughout the investigation. After students silently read the reading passage, the teacher directed questions to the entire class but said, “put your heads together, come up with the best answer you can, and make sure that everyone on your team knows the answer.” The teacher waited approximately 20 s for team members to discuss questions and have one
team member write their answers on a white board. The teacher then randomly selected a number from 1 to 4 and asked students (one in each group) to raise their boards and show their answers. The teacher then asked whether everyone in each group agreed with answers and then provided positive and corrective feedback as needed. At the end of each session, students independently took 10-item quizzes following baseline procedures. Quiz questions were read aloud, and students were allowed 10 min to write their responses. Quizzes were scored and returned the next day.

Numbered Heads Together Plus Incentives (NHT + I)

During the NHT plus incentive sessions, the teacher followed the same procedures outlined in the NHT condition (read from a script, prompted students to chorally respond to condition, directed pupils to write responses on white board, and checked others for agreement) with one exception: the teacher now provided contingent rewards. Immediately after the NHT + I activity, students received blue raffle tickets (i.e., tokens) contingent on their performance during language arts. The teacher provided verbal feedback (“Carmen, you are earning your reward”) every 5 min for exhibiting expected behavior (i.e., on-task behavior, writing on response cards, answering questions, etc.). Student progress was recorded through a progress-monitoring sheet. The students (including those not in the study) had to get three out of three tally marks in order to receive a ticket and exchange it for a reward. Rewards included having a 15-min independent activity time (i.e., use of computers, playing board games), healthy snacks, and educational items. All three students received the rewards during all NHT + I conditions. During the study, a classroom-level system was in place with rewards separate from those of the NHT + I condition. The system consisted of four levels based on percentage of appropriate student overall performance during the week (i.e., 67–77, 78–87, 88% or higher).
Students on the fourth level were considered to have appropriate behavior for transition to the general education classroom. At the time of the study, the three target students were on the second or third level. Students were given their rewards during the morning or lunch, prior to the afternoon language arts activity. Furthermore, no students met the criteria for earning a reward on days when the NHT-only condition was in effect.

Treatment Adherence

Three different procedural checklists were developed to measure the accuracy with which the teacher used the three interventions (i.e., BL, NHT, and NHT + I). Independent observers checked each step as either present or absent over the course of the investigation. Checks for the BL condition consisted of six items. The six items included reviewing classroom rules, allowing 2 min to read quiz questions, allowing 10 min to read passages, asking 10 knowledge-based questions, randomly calling on students to answer questions, and reading quiz questions. In contrast, NHT and NHT + I procedural checklists consisted of 13 and 15 steps, respectively. The seven additional steps for the NHT condition included reviewing rules for NHT,
prompting students to chorally respond regarding which treatment was in effect, assigning students to heterogeneous groups, asking students to “put their heads together”, prompting students to raise response cards and show answers, asking which students agree or disagree with the answer, and prompting students to go back quietly to their individual seats and take the quiz. The NHT + I checklist contained two additional steps: providing feedback every 5 min and immediately providing rewards (i.e., computers, receiving a healthy snack, etc.). Treatment adherence was calculated by dividing the number of steps present by the total number of steps and then multiplying by 100%. Finally, checks were implemented during the follow-up session. Treatment integrity data indicated that the teacher implemented all procedural steps for BL, NHT, and NHT + I with 100% adherence on all occasions.

Social Validity

At the completion of the study, the teacher and students were asked to complete social validity surveys regarding the acceptability and usefulness of each intervention strategy (Haydon et al. 2010). The surveys were completed independently and anonymously. The teacher survey, based on prior use, included nine questions and used a 4-point Likert-type scale, where 1 represented not at all and 4 represented very much. The rating scale consisted of three categories: (a) ease of implementation, (b) intervention effectiveness, and (c) likelihood of future intervention use. Students also rated nine questions using a 4-point Likert-type scale, consisting of three categories of perceived effects of each intervention on their (a) social, (b) academic, and (c) peer interaction behaviors.
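The treatment adherence calculation described under Treatment Adherence (steps present divided by total steps, multiplied by 100%) can be sketched as below. The checklist lengths come from the text (6, 13, and 15 steps); the dictionary, function name, and the partial-adherence example are hypothetical and serve only to illustrate the formula.

```python
# Sketch only: checklist lengths are from the text; all names are hypothetical.
CHECKLIST_STEPS = {"BL": 6, "NHT": 13, "NHT + I": 15}


def adherence(condition, steps_present):
    """Return treatment adherence as steps present / total steps x 100."""
    total = CHECKLIST_STEPS[condition]
    if not 0 <= steps_present <= total:
        raise ValueError("steps_present must be between 0 and the checklist length")
    return 100.0 * steps_present / total


# All 13 NHT steps observed, as in the reported 100% adherence.
print(adherence("NHT", 13))

# Hypothetical partial case for illustration: 3 of the 6 baseline steps observed.
print(adherence("BL", 3))
```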

Results

On-Task Behavior

Table 2 summarizes means and ranges for the percentage of intervals of on-task behavior across students and experimental conditions. All three students demonstrated lower mean percentages of on-task behavior during BL when compared to both NHT and NHT + I conditions. However, all three students demonstrated similar percentages of on-task behavior in NHT + I versus NHT conditions. As Fig. 1 shows, Carmen, Nate, and RJ displayed the lowest levels of on-task behavior during BL conditions. In contrast, all three students showed their highest levels of on-task behavior while NHT and NHT + I were in effect. Further visual analyses revealed that (a) for Carmen, seven of eight (87.5%) NHT data points exceeded the highest BL data point, while six of seven (85.7%) NHT + I data points exceeded the highest BL data point; (b) for Nate, five of eight (62.5%) NHT data points exceeded the highest BL data point and three out of eight (37.5%) NHT + I data points exceeded the highest BL data point; and (c) for RJ, seven of seven (100%) data points exceeded the highest BL data point during both NHT and NHT + I conditions. For all three students, no clear differences in PND could be determined between NHT and NHT + I conditions.
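The PND counts above follow the usual rule for a behavior that is expected to increase: count the intervention data points that exceed the highest baseline data point and divide by the number of intervention data points. The Python below is a minimal sketch of that rule. The function name and both data series are hypothetical; the series were constructed only so that seven of eight points exceed the baseline maximum and are not the study's session-level data.

```python
# Sketch only: the function name and both series are hypothetical illustrations.
def pnd(baseline, treatment):
    """Percentage of treatment points that exceed the highest baseline point."""
    if not baseline or not treatment:
        raise ValueError("both series must contain at least one data point")
    ceiling = max(baseline)
    exceeding = sum(1 for value in treatment if value > ceiling)
    return 100.0 * exceeding / len(treatment)


# Hypothetical percentages of on-task intervals per session.
baseline_sessions = [50.0, 70.0, 95.8]
treatment_sessions = [93.3, 100.0, 100.0, 97.0, 98.0, 99.0, 100.0, 96.5]
print(pnd(baseline_sessions, treatment_sessions))  # 7 of 8 points exceed 95.8
```

Because the comparison uses only the single highest baseline point, one unusually high baseline session can lower PND sharply even when the condition means are far apart.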

Table 2 Means and ranges for on-task behavior and quiz scores in each condition

          Baseline                               NHT                                    NHT + I
Student   On-task M (Range)   Quiz M (Range)     On-task M (Range)   Quiz M (Range)     On-task M (Range)   Quiz M (Range)
Carmen    71.1 (50.0–95.8)    32.9 (0.0–50.0)    99.2 (93.3–100)     61.4 (40.0–70.0)   99.3 (95.0–100)     41.7 (30.0–50.0)
Nate      82.7 (44.4–95.8)    52.9 (10.0–80.0)   96.2 (81.3–100)     74.3 (60.0–90.0)   94.1 (77.7–100)     75.9 (70.0–90.0)
RJ        36.0 (27.7–52.1)    40.0 (20.0–50.0)   95.5 (82.3–100)     48.6 (30.0–70.0)   96.5 (80.0–100)     68.0 (50.0–80.0)
Mean      63.3                41.9               97.0                61.4               96.6                61.9

Fig. 1 Participants’ percentage correct on quiz scores and intervals on-task during each condition of the study

Quiz Scores

Table 2 also summarizes the means and ranges relevant to quiz scores across students and conditions. All three students demonstrated their lowest quiz scores during BL and their highest averages when NHT and/or NHT + I were in effect. Differences in quiz scores were less clear, however, between NHT and NHT + I conditions. Nate and RJ demonstrated higher quiz scores during NHT + I (Nate: NHT + I M = 76%, NHT M = 74%; RJ: NHT + I M = 68%, NHT M = 49%), while Carmen earned higher quiz scores during NHT rather than NHT + I (NHT M = 61% vs. NHT + I M = 42%). Figure 1 shows mean quiz scores of participants across experimental conditions. Quiz scores were typically highest during NHT + I for Nate and RJ, and during the NHT condition for Carmen. The least amount of variability was evident during NHT for Carmen and BL for RJ. Data showed upward trends for Nate during BL and during NHT + I for RJ. Further visual analyses revealed that (a) for Carmen, seven of eight (87.5%) NHT data points exceeded the highest BL data point, while no NHT + I data points exceeded the highest BL data point, and there were seven of eight
(87.5%) non-overlapping data points between NHT and NHT + I conditions; (b) for Nate, three out of eight (37.5%) NHT data points exceeded the highest BL data point and four out of eight (50%) NHT + I data points exceeded the highest BL data point, while two out of seven (28.5%) NHT + I data points exceeded the highest NHT data point; and (c) for RJ, two out of seven (28.5%) NHT data points exceeded the highest BL data point, while five out of seven (71.4%) NHT + I data points exceeded the highest BL data point, and two out of seven (28.5%) NHT + I data points exceeded the highest NHT data point.

Social Validity

At the end of the study, the teacher and three target students completed a 9-question social validity questionnaire (Haydon et al. 2010). The questionnaire consisted of 4-point Likert scale responses ranging from 1 (not at all) to 4 (very). Teacher responses (1.0) indicated that NHT was the easiest to implement, while NHT + I was somewhat more difficult (2.0) to use. A high score (4.0) suggested that the teacher found both NHT strategies very helpful, that he would be very likely to use the intervention in the future, and that he observed a very large increase in student on-task behavior. Midrange scores (3.0) suggested that the teacher felt the intervention was fairly helpful for students’ reading comprehension and that during both NHT activities the students got along fairly well. Two students (Carmen and RJ) indicated that they liked NHT + I better than NHT, while Nate expressed a preference for NHT. All three students indicated that they liked the interventions very much (4.0). Midrange scores (M = 2.66; range 2–4) indicated that students thought they were on-task more and participated more (M = 3.0; range 2–4) and that the NHT strategies helped their reading comprehension (M = 3.33; range 3–4).

Discussion

The primary purpose of this study was to investigate the differential effects of traditional hand-raising (HR) and two NHT instructional strategies on student on-task behavior and language arts quiz scores. With respect to quiz scores under either NHT condition, mean quiz scores increased by 29, 23, and 28% for Carmen, Nate, and RJ, respectively, during the study. However, results were mixed regarding which NHT intervention was more effective. Only for RJ did NHT + I (M = 68.0) clearly emerge as the most effective intervention for improving quiz scores, compared with NHT (M = 48.6) and HR (M = 40.0). For Nate, data indicated a negligible difference in mean quiz scores between the NHT and NHT + I conditions (M = 74.3 and M = 75.9, respectively), and for Carmen, mean quiz scores were higher under NHT (M = 61.4%) than under NHT + I (M = 41.7%).

For the variable of on-task behavior, the present findings indicated that all three students had noticeable increases in on-task behavior under both NHT conditions as opposed to baseline. RJ, in particular, showed a 60% increase in his on-task behavior under both NHT conditions. As with quiz scores, results were mixed on
which NHT intervention was more effective. While both interventions appeared to be more effective than BL (hand-raising), neither produced clearly beneficial effects over the other. The lack of differential effects between the NHT and NHT + I strategies is consistent with previous results (Maheady et al. 2006). Even so, the present study provides a third replication of the positive effects of NHT on pupils’ academic performance and active engagement in class discussions. In addition, study outcomes extend previous results to a different grade level (7th grade), content area (i.e., language arts), and setting (self-contained classroom), and across students with varied special education classifications (EBD, mild intellectual disability, OHI).

The study also provided two unique findings. First, students made immediate and noticeable gains in their academic performance. In earlier investigations (Maheady et al. 2002, 2006), student performance increases occurred more gradually. Second, visual analyses indicated relatively stable trends throughout both NHT and NHT + I conditions, whereas previous outcomes showed downward performance trends over time. Furthermore, for Carmen and Nate there were large downward trends for on-task behavior during BL conditions. Collectively, these data suggested positive intervention effects when using either NHT option, a finding that was consistent with prior investigations.

A number of possible explanations exist for the positive effects of both NHT interventions. First, students were given more time to discuss and formulate answers prior to being asked to respond independently as in hand-raising conditions. As such, the “putting your heads together” process may have served as a form of “precorrection” for pupils when formulating responses. Second, they were more likely to hear correct responses in small groups than when working individually, particularly because each small group also had average and above-average members for that particular class. Third, because students could not predict who would be called on and other team members were depending on them for earning possible rewards (i.e., inter-dependent group contingencies), pupils may have paid better attention while possible answers were being discussed. Finally, students may have performed better under NHT conditions because they had more opportunities to respond (Kagan 1992; Sutherland and Wehby 2001). During BL, for example, the same few students raised their hands to volunteer responses. However, during NHT and NHT + I sessions, all students were given equal response opportunities (i.e., wrote answers on shared white boards, discussed questions, and were prompted by the teacher to indicate whether they agreed with each answer). The latter explanation is particularly important because researchers have shown that when students’ opportunities to respond are increased, they tend to perform better academically (e.g., Haydon et al. 2010; Maheady et al. 2006).

The lack of a powerful effect of the NHT strategies on quiz scores for all three students deserves further attention. While all target students’ quiz scores improved by about two letter grades, pupils’ overall performance was still below average grade expectations (i.e., C or below grades). On the other hand, the objective of the lesson was to understand literature and provide practice opportunities toward acquisition; therefore, a lower level of performance may be more acceptable. Furthermore, the NHT procedure shows potential for improving performance on content that appears similar to general education content
instruction. One implication for teachers is that these procedures may need to be supplemented with other practice and retention interventions. Finally, even though moderate performance is desirable when comparing interventions in an alternating treatments design (to avoid ceiling effects), in practice it is desirable to present material that is better matched to student performance level.

An analysis of pupil responses to the NHT strategies was quite interesting. First, Carmen and Nate reported that they were somewhat on-task and participated more while NHT was in effect, yet observational data did not confirm these perceptions. RJ’s perceptions, on the other hand, were quite consistent with observational data suggesting that he was on-task more often during NHT. Carmen, a lower achiever, also rated her experiences more positively than did her higher-performing peers (Nate and RJ). This finding is consistent with Ghaith’s (2001) observation that low-achieving learners are more comfortable working in small groups than their more capable peers. The classroom teacher reported that NHT strategies were very helpful, easy to implement, and highly likely to be used in the future. The fact that the teacher used NHT the following year without experimenter encouragement speaks to the intervention’s social acceptability.

Although the present findings were positive, there are a few limitations. First, the study is limited by its small sample size, focus on one academic outcome measure, and failure to collect generalization data. Second, there was a lack of clear differential effects for one NHT strategy over the other. On the other hand, the fact that NHT was effective without the behavioral incentive package may make it more feasible to implement and/or more attractive to non-behavioral consumers. A third limitation relates to overall improvements in pupil performance. It was clear that greater academic gains must be achieved for all of these students to succeed in school. Improving their performance is a good first step in the process. Another limitation was the failure to assess directly the verbal and non-verbal interactions that occurred among students. As such, we cannot conclude that all students participated fully in their small groups by discussing relevant academic content.

Future research should analyze and code pupils’ verbal and non-verbal behavior during cooperative learning sessions. This is particularly important when target students have histories of interpersonal difficulties (e.g., students with EBD). It might also be interesting to examine any possible gender differences that emerge during cooperative learning activities. Some researchers (e.g., Lockheed et al. 1983; Webb 1984), for example, have suggested that males in mixed gender groups tend to perceive their female group-mates as less competent. Researchers could differentiate the effects of one NHT strategy over the other by using preferred items or activities as incentives (Kelshaw-Levering et al. 2000; Shaaban 2006). Finally, future researchers should examine the impact of NHT over longer intervention periods. Because reading comprehension skills tend to develop over longer time periods, it may be more appropriate to examine treatment effects over several months (Chapman and Cope 2004; Fielding and Pearson 1994).

In conclusion, researchers have demonstrated that cooperative learning strategies such as NHT are powerful instructional strategies that allow teachers to educate students with wide-ranging abilities in various settings such as self-contained and general education classrooms (Slavin 1995). Using NHT strategies has an added
benefit of improving students’ active participation, social skills, and cooperative skills while reducing disruptive behavior. Furthermore, even without a behavioral incentive package, teachers can use the NHT strategy by itself and thereby improve student social and academic behavior in comparison with their typical teaching strategies.

Acknowledgement

Preparation of this article was supported in part by the University of Cincinnati’s Research Council Grant.

References

Barbetta, P. M., Heron, T. E., & Heward, W. L. (1993). Effects of active student response during error correction on the acquisition, maintenance, and generalization of sight words by students with developmental disabilities. Journal of Applied Behavior Analysis, 26(1), 111–119.
Barbetta, P. M., & Heward, W. L. (1993). Effects of active student response during error correction on the acquisition and maintenance of geography facts by elementary students with learning disabilities. Journal of Behavioral Education, 3(3), 217–233.
Barlow, D. H., & Hayes, S. C. (1979). Alternating treatments design: One strategy for comparing the effects of two treatments in a single subject. Journal of Applied Behavior Analysis, 12(2), 199–210.
Campbell, J. M. (2004). Statistical comparison of four effect sizes for single-subject designs. Behavior Modification, 28(2), 234–246.
Chapman, E. S., & Cope, M. T. (2004). Group reward contingencies and cooperative learning: Immediate and delayed effects on academic performance, self-esteem, and sociometric ratings. Social Psychology of Education, 7(1), 73–87.
Cooper, J. O., Heron, T. E., & Heward, W. L. (2007). Applied behavior analysis. Columbus, OH: Merrill Prentice Hall.
Coutinho, M. J. (1986). Reading achievement of students identified as behaviorally disordered at the secondary level. Behavioral Disorders, 11(4), 200–207.
Emmer, E. T., Evertson, C. M., & Brophy, J. E. (1979). Stability of teacher effects in junior high classrooms. American Educational Research Journal, 16(1), 71–75.
Fielding, L. G., & Pearson, D. P. (1994). Reading comprehension: What works. Educational Leadership, 51(5), 62–68.
Ghaith, G. M. (2001). Learners’ perceptions of their STAD cooperative experience. System, 29(2), 289–301.
Gunter, P. L., & Denny, R. K. (1998). Trends and issues in research regarding academic instruction of students with emotional behavioral disorders. Behavioral Disorders, 24(1), 44–50.
Haydon, T., Conroy, M. A., Sindelar, P. T., Scott, T. M., Barber, B., & Orlando, A. M. (2010). A comparison of three types of opportunities to respond on student academic and social behaviors. Journal of Emotional and Behavioral Disorders, 18(1), 27–40.
Hayling, C. C., Cook, C., Gresham, F. R., State, T., & Kern, L. (2008). An analysis of the status and stability of the behaviors of students with emotional and behavioral difficulties: A classroom direct observation study. Journal of Behavioral Education, 17(1), 24–42.
Housel, D. J. (2007). Document-based questions for reading comprehension and critical thinking. Westminster, CA: Teacher Created Resources, Inc.
Johnson, D. W., & Johnson, R. T. (1999). Making cooperative learning work. Theory into Practice, 38, 67–73.
Johnson, D. W., Johnson, R. T., & Holubec, E. (1991). Cooperation in the classroom. Edina, MN: Interaction Book Company.
Kagan, D. M. (1992). Professional growth among pre-service and beginning teachers. Review of Educational Research, 62(2), 129–169.
Kazdin, A. E. (1982). Single case research designs. New York: Oxford University Press.
Kelshaw-Levering, K., Sterling-Turner, H. E., Henry, J. R., & Skinner, C. H. (2000). Randomized interdependent group contingencies: Group reinforcement with a twist. Psychology in the Schools, 37(6), 523–533.
Kennedy, C. H. (2005). Single-case designs for educational research. Boston, MA: Allyn & Bacon.
Kentucky Department of Education. (2008). Kentucky administrative regulations: Special education programs. Frankfort, KY: Kentucky Department of Education.
Lane, K., Wolery, M., Reichow, B., & Rogers, L. (2007). Describing baseline conditions: Suggestions for study reports. Journal of Behavioral Education, 16(3), 224–234.
Lerner, J., & Johns, B. (2009). Learning disabilities and related mild disabilities: Characteristics, teaching strategies and new directions (11th ed.). NY: Houghton Mifflin Harcourt Publishing Company.
Lockheed, M. E., Abigail, M. H., Harris, A. M., & Nemceff, W. P. (1983). Sex and social influences: Does sex function as a status characteristic in mixed-sex groups of children? Journal of Educational Psychology, 75(6), 877–888.
Maheady, L., Mallette, B., Harper, G. F., & Sacca, K. (1991). Heads together: A peer-mediated option for improving the academic achievement of heterogeneous learning groups. Remedial and Special Education, 12(2), 25–33.
Maheady, L., Michielli-Pendl, J., Harper, G. F., & Mallette, B. (2006). The effects of numbered heads together with and without an incentive package on the science test performance of a diverse group of sixth graders. Journal of Behavioral Education, 15(1), 25–39.
Maheady, L., Michielli-Pendl, J., Mallette, B., & Harper, G. F. (2002). A collaborative research project to improve the academic performance of a diverse sixth grade science class. Teacher Education and Special Education, 2(1), 55–70.
Mooney, P., Epstein, M. H., Reid, R., & Nelson, J. R. (2003). Status and trends of academic intervention research for students with emotional disturbance. Remedial and Special Education, 24(5), 273–287.
Mortweet, S. L., Utley, C. A., Walker, D., Dawson, H. L., Reddy, S. S., Greenwood, C. R., et al. (1999). Classwide peer tutoring: Teaching students with mild retardation in inclusive classrooms. Exceptional Children, 65(4), 524–536.
Nelson, J. R., Johnson, A., & Marchand-Martella, M. (1996). Effects of direct instruction, cooperative learning, and independent learning practices on the classroom behavior of students with behavioral disorders: A comparative analysis. Journal of Emotional and Behavioral Disorders, 4(1), 53–62.
No Child Left Behind Act of 2001, Pub. L. No. 107–110, 115 Stat. 1425 (2002).
Putnam, J. W. (1998). Cooperative learning and strategies for inclusion: Celebrating diversity in the classroom. Baltimore, MD: Brooks Publishing Inc.
Ryan, J. B., Reid, R., & Epstein, M. H. (2004). Peer-mediated intervention studies on academic achievement for students with EBD: A review. Remedial and Special Education, 25(6), 330–341.
Shaaban, K. (2006). An initial study of the effects of cooperative learning on reading comprehension, vocabulary acquisition, and motivation to read. Reading Psychology, 27(5), 377–403.
Slavin, R. E. (1995). Cooperative learning: Theory, research and practice (2nd ed.). Boston: Allyn and Bacon.
Sutherland, K. S., & Wehby, J. H. (2001). Exploring the relationship between increased opportunities to respond to academic requests and the academic and behavioral outcomes of students with EBD: A review. Remedial and Special Education, 22(2), 113–121.
Sutherland, K. S., Wehby, J. H., & Gunter, P. L. (2000). The effectiveness of cooperative learning with students with emotional behavior and behavioral disorders: A literature review. Behavioral Disorders, 25(3), 225–238.
Trout, A. L., Lienemann, T. O., Reid, R., & Epstein, M. H. (2007). A review of non-medication interventions to improve the academic performance of children and youth with ADHD. Remedial and Special Education, 28(4), 207–226.
Tucker, M., Sigafoos, J., & Bushell, H. (1998). Use of noncontingent reinforcement in the treatment of challenging behavior: A review and clinical guide. Behavior Modification, 22(4), 529–547.
U.S. Department of Education. (2001). Twenty-third annual report to Congress on the implementation of the Individuals with Disabilities Education Act. Washington, DC: U.S. Department of Education.
Wanzek, J., & Vaughn, S. (2009). Students demonstrating persistent low response to reading intervention: Three case studies. Learning Disabilities Research & Practice, 24(3), 151–163.
Webb, N. (1984). Sex differences in interaction and achievement in cooperative small groups. Journal of Educational Psychology, 76(1), 33–44.
Wehby, J. H., Lane, K. L., & Falk, K. B. (2003). Academic instruction for students with emotional and behavioral disorders. Journal of Emotional & Behavioral Disorders, 11(4), 194–197.
