Team Training in the Skies: Does Crew Resource Management (CRM) Training Work?

Eduardo Salas C. Shawn Burke Clint A. Bowers Katherine A. Wilson University of Central Florida Orlando, FL, USA

Key Words:

crew resource management, teamwork, aviation, team training, training evaluation, multi-level evaluation, safety

Shortened Title: Team Training in the Skies

The aviation community has invested great amounts of money and effort into crew resource management (CRM) training. Using Kirkpatrick’s (1976) framework for evaluating training, we reviewed 58 published accounts of CRM training to determine its effectiveness within aviation. Results indicated that CRM training generally produced positive reactions, enhanced learning, and desired behavioral changes. However, we cannot ascertain whether CRM has an effect on an organization’s bottom line (i.e., safety). We discuss the state of the literature with regard to evaluation of CRM training programs and, as a result, call for the need to conduct systematic, multi-level evaluation efforts that will show the true effectiveness of CRM training.

Address correspondence to:

Dr. Eduardo Salas Department of Psychology University of Central Florida P.O. Box 161390 Orlando, FL 32816-1350 (407) 823-2552(w); 823-5862 (fax) [email protected]

It is well acknowledged that 60-80% of accidents and mishaps occurring in aviation have been attributed to human error (Freeman & Simmon, 1991). A large share of these is due to failures in coordination among cockpit crews. For example, poor pilot performance and faulty crew resource management (CRM) have been cited as contributing factors in numerous accidents and incidents reported by major airlines during the time period covering 1983-1985 (U.S. GAO, 1997). In addition, CRM deficiencies (e.g., lack of coordination among cockpit crews, captain’s failure to assign tasks to other members, and a lack of effective crew supervision) were a contributing cause in approximately half of the above reported accidents that involved one or more fatalities (U.S. GAO, 1997). Other reviews have found similar factors at work within cited accident reports (see Leedom & Simon, 1995; Chidester, Helmreich, Gregorich, & Geis, 1991; Gregorich, Helmreich, & Wilhelm, 1990).

Within the aviation environment, teamwork deficiencies are not only embarrassing and highly publicized, but can lead to tragic consequences. For example, Eastern Airlines Flight 401 crashed in the Florida Everglades in December 1972, because the crew permitted their fully operational Lockheed L-1011 to fly into the ground. What the crew failed to realize was that the altitude hold feature of the autopilot had been accidentally disconnected (as cited in Kayten, 1993). Results of the investigation revealed that the entire three-person crew was preoccupied with a landing gear light that had failed to illuminate at the time of the accident. Many other aviation accidents resulting in disastrous consequences have also been attributed to faulty CRM skills (Allegheny Airlines, 1971; 1978; Mohawk Airlines, 1972; United Airlines, 1978; and others as cited in Kayten, 1993; U.S. GAO, 1997).
In an effort to manage some of these problems with teamwork and the resulting safety issues, the aviation industry introduced the concept of CRM (Wiener, Kanki, & Helmreich, 1993; Salas, Bowers, & Edens, 2001). CRM was introduced as a way to train aircrews to use all available resources (equipment, people, and information) by communicating and coordinating as a team. CRM has now been used within the aviation industry for over 20 years, and has undergone several evolutions with varying foci (Helmreich, Merritt, & Wilhelm, 1999; Helmreich & Foushee, 1993; Maurino, 1999). Specifically, during the first evolution, the emphasis was on changing individual styles and correcting deficiencies in individual behavior, with a heavy focus on psychological testing. The second evolution represented a focus on cockpit group dynamics, was more modular, and dealt more with specific aviation concepts related to flight operations. With the third evolution came a broadening of scope. Specifically, training began to recognize the characteristics of aviation systems in which crews must function, as well as expanding to areas outside the cockpit (e.g., cabin crews, maintenance personnel). With the fourth generation came integration and proceduralization. Specifically, under the Advanced Qualification Program (AQP), carriers were allowed to tailor training to fit the needs of their specific organization, they were required to provide both CRM and line-oriented flight training (LOFT) to all crews, and CRM training was integrated with technical training. The fifth and latest evolution represents an awareness that human error is inevitable and can provide a great deal of information. CRM is now being used as a way to try to manage these errors by focusing on training teamwork skills that will promote: (1) error avoidance, (2) early detection of errors, and (3) minimization of consequences resulting from CRM errors. Programs are beginning to go beyond error management to include a focus on threat recognition and management.

The evolutions that CRM has witnessed have occurred over roughly two decades of use within the aviation community; research during this time has produced several lessons learned (see Salas, Bowers, & Edens, 2001). For example, research has yielded information about how to maximize the design and delivery of CRM training through scenario design (Prince, Oser, Salas, & Woodruff, 1993; Prince & Salas, 1999), scenario feedback (Salas, Rhodenizer, & Bowers, 2000; Prince, Brannick, Prince, & Salas, 1997), and the training of operational personnel as observers and raters (Brannick, Salas, & Prince, 1997). In this vein, Helmreich and Wilhelm (1987) found that the systematic training of raters in CRM concepts made a significant difference in the quality of ratings and scale use (e.g., raters using entire scale) as compared to raters who were trained less systematically (as cited in Helmreich, Chidester, Foushee, Gregorich, & Wilhelm, 1990). Evidence has also been provided that suggests that low fidelity simulations can be used to practice/train CRM-related skills (Bowers, Salas, Prince, & Brannick, 1992; Baker, Prince, Shrestha, Oser, & Salas, 1993; Jentsch & Bowers, 1998). Furthermore, we have learned that national culture plays a powerful role in determining the effectiveness of CRM training programs (Maurino, 1994; Merritt & Helmreich, 1995b). Specifically, we have learned that attitudes that define the core concepts of CRM differ dramatically across national borders (e.g., individualism/collectivism, power distance, uncertainty avoidance, and division of roles between sexes; see Hofstede, 1988).
As such, initial attempts to apply CRM globally were often unsuccessful due to a failure to recognize the power of national culture (Helmreich, Wilhelm, Klinect, & Merritt, in press). Finally, we know that anecdotal evidence, as well as reactions to CRM training, generally suggests that CRM training can prevent accidents (see Diehl, 1991; Kayten, 1993), and that CRM is being applied in domains outside aviation (see Flin, 1995; Howard, Gaba, Fish, Yang, & Sarnquist, 1992; Merritt & Helmreich, 1995a).

Although positive lessons have been learned, there are areas in need of improvement. To begin with, we know that, despite its long history, there remains a lack of consistency within the aviation industry with regard to definitions of CRM, training content, and methods of delivery (Wilhelm, 1991; Helmreich & Wilhelm, 1987; Salas, Prince, Bowers, Stout, Oser, & Cannon-Bowers, 1999). Second, we know that with the development of the Advanced Qualification Program (AQP) guidelines, each individual airline now has two choices in deciding how to implement CRM training: (1) following the traditional requirements as mandated by Federal Aviation Regulation (FAR) part 121, or (2) using the AQP guidelines. FAR part 121 states the general operating requirements for domestic, flag, and supplemental operations, and contains the general requirements for CRM training. However, it leaves methods for CRM curriculum design and development, as well as for evaluation, ambiguous. In an effort to resolve the lack of guidance offered by the FAA regarding CRM training, the AQP guidelines were created. Under the AQP guidelines, airlines are provided with: (1) a process for developing the curriculum in order to integrate traditional CRM programs with technical training, (2) the required level of performance that must be achieved, and (3) guidelines that provide inspectors with evaluation criteria to determine if the curriculum meets all FAA requirements (U.S. GAO, 1997; Helmreich et al., 1999). Despite the creation of AQP guidelines, airlines are not required to use them in replacement of FAR 121 requirements; therefore, there is still much ambiguity and a lack of consensus when it comes to the design, delivery, and evaluation of CRM training programs.

Although this ambiguity surrounds the design (i.e., content) and evaluation of these programs, the aviation community continues to invest millions of dollars in CRM training, and other communities (e.g., medical, offshore oil, maritime shipping companies) are now starting to “jump on the bandwagon”. Given the increased adoption of CRM as a worthwhile team training approach, there is a need to summarize the current state of knowledge about the effectiveness of CRM training in a systematic manner. That is, we need to assess the effectiveness of the implemented team training systems. Therefore, the purpose of this paper is to use Kirkpatrick’s (1976) typology for training evaluation as a framework to evaluate the effectiveness of CRM training programs in aviation. Specifically, the review is organized via the type of evidence collected after training (i.e., reaction, learning, behaviors, and/or organizational effectiveness).

THE NEED FOR EVALUATION

Training evaluation has been defined as “the systematic collection of descriptive and judgmental information necessary to make effective training decisions related to the selection, adoption, value, and modification of various instructional activities” (Goldstein, 1993, p. 147). Although it is acknowledged that systematic training evaluation is not an easy task, it is the only way to ensure that training programs are having the desired effect and are a worthwhile investment for the organization. Similarly, Cannon-Bowers et al. (1989) have argued that training evaluation may serve a number of important functions.
First and most obvious, program evaluation results can indicate whether the goals and objectives of a program are appropriate to achieve the desired outcome. Second, evaluation can indicate whether the content and methods used in training will result in achievement of the overall program goal. Third, evaluation data can be used to determine how to maximize transfer of training. Fourth, it can serve as feedback at both the individual and team level to suggest areas in need of improvement or revision. Goldstein (1993) offers similar arguments as to the benefits of evaluation.

The most popular framework for guiding training evaluations is Kirkpatrick’s (1976) typology. Kirkpatrick argued for a multi-level approach to training evaluation consisting of four levels of evaluation: (1) reactions, (2) learning, (3) behavior (i.e., extent of performance change), and (4) results (i.e., degree of impact on organizational effectiveness or mission success). Within recent years, this typology has been expanded by several researchers (see Kraiger, Ford, & Salas, 1993; Salas & Cannon-Bowers, 2001). For example, Kraiger et al. (1993) expanded Kirkpatrick’s typology by arguing that learning is multi-dimensional and results in cognitive, affective, and skill-based learning outcomes. Moreover, Kraiger et al. suggest potential methods that can be used to evaluate each of these outcomes: (1) cognitive (verbal knowledge, knowledge organization, cognitive strategies), (2) affective (attitudinal, motivational), and (3) skill-based (compilation, automaticity). Goldsmith and Kraiger (1997) have built upon this work by describing a method for the structural assessment of an individual learner’s knowledge and skill, which has been successfully used in aviation research efforts (see Kraiger, Salas, & Cannon-Bowers, 1995; Stout, Salas, & Kraiger, 1997).

The utilization of Kirkpatrick’s typology and corresponding revisions serves several important functions within the training evaluation process. First, it has served to organize the type of information that should be collected in the assessment of training. Second, it has served to argue for the added benefit/importance of collecting more than one level of evaluation information. Although both points perform important functions, the second has been more difficult to put into practice than the first. Specifically, Alliger and Janak (1989) reported that fewer than 10% of organizations assess training programs at all four levels of evaluation, as argued for by Kirkpatrick.

HEEDING THE CALL: EVALUATION OF CRM TRAINING

Our review resulted in the identification of 58 studies that appeared to evaluate the effectiveness of aviation CRM training programs. We next provide a description of the state of CRM evaluation efforts with respect to each of the levels of evaluation as identified by Kirkpatrick (1976). Specifically, studies that assessed training at only one level will be reviewed first, beginning with those collecting reaction data and ending with those collecting results/organizational effectiveness data. Following this will be a brief review of studies that assessed training at multiple levels, as argued for by Kirkpatrick. For a summary of individual studies see Table 1, which describes the findings of each of the identified studies in relation to reactions, learning, behavioral, and organizational effectiveness data.

Do Aviators Like CRM?: Reaction Evidence

Reaction evidence is the first level of Kirkpatrick’s (1976) typology and amounts to an assessment of trainees’ feelings toward the training program. Reaction data are assessed post training and examine the degree to which participants perceive that training was worthwhile, relevant, interesting, and/or well conducted.
Reaction evidence is perhaps the easiest to collect and usually takes the form of a paper-and-pencil questionnaire where the response format is typically a Likert scale. A small sample of studies have also gathered reaction data through the utilization of questionnaires that ask participants to rank order the perceived usefulness of the CRM components included in training. Our review of the available literature indicates that reaction data are a commonly collected type of evaluation data. Specifically, of the studies included in our review, 27 of 58 (46%) involved the collection of reaction data (see Table 1). Of these 27 studies, 9 collected information solely related to participant reactions. Furthermore, the reaction data that were collected in the reviewed studies tended to reflect both affective feelings towards CRM programs, as well as the utility of these programs.

Alliger, Tannenbaum, Bennett, and Traver (1997) argue that “liking of training” is the most common form of training assessment, and the results from the current review tend to support this argument in that 12 of the 27 studies (44%) assessed participants’ affective reactions towards training. Overall, the results of these studies suggest that most participants like CRM training. In addition to assessing the overall affective reactions to training, some studies (e.g., Baker et al., 1993; Horman, Goeters, Maschke, & Schiewe, 1995) also assessed affective reaction to particular components of CRM training. For example, Schiewe (1995) found that units that were based on case studies or used role play were very well liked by participants, while those based mostly on lecture were not rated favorably. Others have reported similar findings (see Baker, Bauman, & Zalesny, 1991), suggesting that perhaps methods that promote interaction among participants are liked better than those that are more passive.

Reactions as to how participants “liked” training are not the only type of reactions that may be collected from participants. Alliger et al. (1997) also suggest assessing participant reactions to the utility of training. Questions related to the assessment of utility “…attempt to ascertain the perceived utility value, or usefulness, of training for subsequent job performance” (p. 344). Nine of the 27 studies (33%) assessed the utility of training, while the remaining 7 studies assessed both affective and utility reactions. In terms of utility, the reviewed CRM training programs were seen to be worthwhile, useful, and applicable. Specifically, themes were seen as relevant (Grau & Valot, 1997; Horman et al., 1995) and participants felt that CRM class should be expanded to other fleets/populations (Incalcaterra & Holt, 1999). See Table 1 for further information.

Positive reactions to CRM (affective, utility) were found to exist in single-airline studies, as well as in multi-fleet and multi-airline studies (see Butler, 1993; Helmreich & Wilhelm, 1991). Furthermore, the teaching of teamwork behaviors (e.g., communication, decision-making, leadership [see Alkov, 1991; Alkov & Gaynor, 1991]), use of role-playing exercises (see Baker et al., 1991), and inclusion of cabin crewmembers in training (see Vandermark, 1991) have all contributed to obtaining positive affective and utility reactions in training. Although the aviation community should be applauded for beginning to assess both affective reactions to training, as well as the perceived utility of training, there are a few suggested areas of improvement.
First, in the assessment of the perceived utility of training, very few studies were found to actually ask participants how they would apply the newly learned behaviors back on the job (i.e., specific instances as to how/when these newly acquired skills might be beneficial to use/fit in). Second, Goldstein (1993) has argued for guidelines that should be followed in developing assessment of participant reaction, yet some of the reviewed studies seemed to fall short on these. For example, although Goldstein recommends that responses should be able to be tabulated and quantified, most of the studies reviewed, or at least the data presented, were not in a format where it was apparent that the data were able to be quantified. Typical evidence took the form of statements such as “most of the participants reported liking the training”, or “a few selected cases reported liking the training”. A third concern is that while many studies reported using a Likert-type scale to assess participant reaction, there was a lack of studies that assessed the reliability of the scales used. A final concern is that many of the reviewed studies did not mention the specific components of CRM that were trained within the evaluated program. As CRM training is still not uniformly taught, the delineation of the particular skills taught is important in attempting to make sense of presented findings, as well as determining the extent to which findings may generalize. This last comment is not so much a critique of the data collection process as it is of the dissemination of results.

Summary. Despite the shortcomings mentioned above, the studies that assessed participant reactions to CRM training provide sufficient rigor and converging evidence to lead to the conclusion that CRM training in aviation settings does produce positive reactions. For the most part, aviators like CRM training and perceive that it is worthwhile and useful to the safe conduct of their tasks.
Although this is the simplest form of evaluation criteria, it serves an important purpose. Positive reactions to training are crucial in that they can provide an avenue by which to garner the top level support that is essential for a program’s lasting success, provide evidence of a program’s credibility, and may enhance trainee motivation to learn. Conversely, negative reactions may point to areas of training that need to be revised, as well as providing insight into why desired changes at other levels of evaluation (e.g., learning, behavior) have not occurred (Orlady & Foushee, 1987).

Do Aviators Learn About CRM?: Learning Evidence

In a multi-level evaluation effort, learning evidence is the second level of evaluation, and it refers to “the principles, facts, and skills which were understood and absorbed by participants” (Kirkpatrick, 1976, p. 11). Although evidence at this level includes the learning that occurred during the program, it does not include the actual exhibition of learned behaviors (i.e., skills). Also included in this level is the extent to which training leads to desired attitude changes (i.e., positive attitudes towards CRM training). It is considered here because not only are both learning and attitude changes cognitive events, but both processes mediate performance, and as such, should be evaluated together. The bottom line is that assessment of learning criteria provides evidence as to how successful the training program was in imparting the targeted knowledge, skills, and attitudes, as well as providing the basis for feedback and areas in need of further refinement.

Within the reviewed studies, 52% (30 of 58) collected information related to participant learning, with 11 of these studies solely assessing learning criteria (see Table 1). Within these efforts, the most common type of evidence offered in support of CRM affecting learning was changes in attitudes regarding CRM. Evaluation of attitudes was usually done by collecting information with the Cockpit Management Attitudes Questionnaire (CMAQ; Helmreich, 1984), or a modification of this instrument.
The CMAQ is composed of three major scales: (1) communication and coordination (i.e., communication of intent and plans, delegation of tasks, assignment of responsibilities, and monitoring of crewmembers), (2) command responsibility (i.e., leadership), and (3) recognition of stressor effects (i.e., consideration and compensation for stressors). Overall, studies that assessed learning via attitude change seem to indicate that CRM training can produce positive changes in attitudes that are somewhat stable given top management support (see Table 1).

For the most part, CRM programs seem to produce positive examples of participant learning, primarily as indexed by attitude change (see Table 1); however, there are a few reported instances of CRM programs achieving a “boomerang effect” (e.g., instances of negative attitude change; Helmreich, 1991). One study found that personality type influenced whether participants had positive or negative attitude change (i.e., boomerang) as a result of CRM training (Chidester et al., 1991). The remaining studies reporting this effect, however, did not report the possible cause of the negative attitudes (see Irwin, 1991). For example, was this boomerang effect due to: (1) something within the training itself, (2) crews having very high, positive attitudes prior to training (Chidester et al., 1991), or (3) personality or cultural aspects which may have played a role (Helmreich, 1991; Chidester et al., 1991)? Despite evidence of a “boomerang effect” with some participants, the preponderance of evidence seems to suggest that the majority of participants attending CRM training do learn in the sense that there is typically a positive change in targeted attitudes.

Although the assessment of learning via attitude change is the most popular form of assessing learning, a few studies used other methods (see Hayward & Alston, 1991; Incalcaterra & Holt, 1999; Salas, Fowlkes et al., 1999). For example, Hayward and Alston (1991) reported that as a result of CRM workshops there was an increased awareness of human factors, crew performance, and potential stressors, as well as methods by which to handle these stressors. Assessing another form of learning (i.e., knowledge acquisition), Salas, Fowlkes et al. (1999) found that as compared to teams not trained in CRM, CRM trained groups exhibited higher levels of knowledge regarding CRM principles. Finally, several studies by Salas and colleagues (see also Stout, Salas, & Kraiger, 1997) have shown the positive effects of CRM training by assessing learning via a change in participant knowledge structures (i.e., mental models).

Although the overall picture seems to suggest that CRM training does have a positive impact on participant learning, another important factor to consider is the stability of these learning changes over time. Although there have been many cited studies (see Table 1) that show CRM affecting initial learning/change, there have been fewer studies that have assessed the long term stability of these changes. The few studies that have examined whether the changes produced by training programs remain stable over time have found varying results. For example, although some have indicated that positive attitudes are stable anywhere from two (Incalcaterra & Holt, 1999) to five years out (Byrnes & Black, 1993), others have reported that initial attitude change produced by CRM programs declines over time, regressing towards pre-CRM levels (Irwin, 1991; Helmreich, 1991; Helmreich et al., 1999; Gregorich, 1993).
Empirical results have suggested that one factor contributing to whether attitudes decline over time is whether management reinforces and supports the knowledge, skills, and attitudes learned in CRM programs or only provides “lip service” to material learned in CRM programs (Gregorich, 1993; Helmreich, 1991; Helmreich et al., 1999). Recent work by Helmreich and colleagues has begun to examine the impact of organizational culture, including the safety culture within an organization, on the initial learning of targeted CRM knowledge, skills, and attitudes, as well as the stability of changes over time (see Helmreich et al., in press).

Summary. Similar to the assessment of reaction data, the overall picture provided by learning criteria seems to suggest that CRM training is effective in producing changes in aviator knowledge and attitudes. This conclusion is further supported in that although the predominant form of collecting learning data is through the assessment of attitude change, several studies assessed other forms of learning as well (e.g., knowledge structures, paper-and-pencil tests), and found evidence of learning. The positive evidence offered by multiple measures of learning makes a stronger case for the effectiveness of CRM on learning. Although heading in the right direction, evaluation efforts must continue to strive for the inclusion not only of measures examining attitude change, but measures of declarative knowledge, as well as the assessment of knowledge structures and shared mental models (see Cannon-Bowers et al., 1989; Kraiger et al., 1993).

Do Aviators Apply The Learned CRM Behaviors in the Cockpit?: Behavioral Evidence

Behavioral evidence, the third level identified by Kirkpatrick (1976), provides an assessment of whether the lessons/knowledge learned in training transfer to actual behavior on the job or in a similar simulated environment. It has also been argued to indicate: (1) the extent to which trainees learned how to perform the knowledge, skills, and attitudes (KSAs) taught during training, as well as when to apply these skills, and (2) trainee readiness and overall program effectiveness (Cannon-Bowers et al., 1989). Of the 58 studies reviewed for this effort, 32 (55%) gathered some type of behavioral data, as defined by Kirkpatrick (1976). Furthermore, 12 of the reviewed studies collected information solely at this level. The reader is referred to Table 1 for a specific breakdown of studies that collected behavioral evidence.

Within the reviewed studies, the most common method of assessing behavioral change was through the measurement of CRM-related behaviors while participants performed line-oriented simulation, such as LOFT. More specifically, this type of assessment was evident in 18 of the 32 studies (56%). Less common was behavioral assessment as measured by on-line assessment of behavior (11 of 32), although a small subset of studies did evaluate both behavior in line-oriented simulators, as well as on-line behavior (3 of 32). The studies that collected behavioral data tended to use a combination of the following tools: behavioral observation forms, behavioral checklists, analysis of crew communication (via real time ratings or post-hoc analysis of video-tapes), and peer or self evaluations/reports.

The majority of the reviewed studies that collected some form of behavioral evidence indicated that CRM training had a positive impact on behavior (see Table 1). Specifically, results indicated that CRM trained crews tended to exhibit: (1) improved performance as measured by behaviors indicative of CRM (e.g., decision making, mission analysis, adaptability, situation awareness, communication, leadership) or (2) improved performance as compared to crews not trained in behaviors indicative of CRM.
For example, in a study by Leedom and Simon (1995) results indicated that after receiving CRM training, crews exhibited improved team communication patterns, more efficient management of crew resources, fewer team errors, and improved team coordination.

Summary. Behavioral evidence has been argued to be highly valuable in determining the effectiveness of a training program because it provides a means to assess whether training participants can actually translate the knowledge, behaviors, and attitudes learned in training into action. Overall, the behavioral evidence we reviewed suggests rather strongly that CRM training does have an impact on behavior (primarily as evidenced through LOFT or similar evaluations). Aviators do exhibit more teamwork behavior in the cockpit, and presumably this will lead to safer outcomes. Although more evaluations were found at the behavioral level than initially expected, most of these evaluations have been conducted in simulated environments rather than actually “on the job”. Although resources may put restrictions on the ease with which behavioral data are captured back on the actual job, as opposed to simulated conditions, behavior on the actual job would provide even stronger support for the suggested effectiveness of CRM training with regard to behavioral change. However, behavioral data collected during simulated situations are definitely a close surrogate and a welcome start in the right direction.

Are The Skies Safer?: Results/Evidence of Organizational Impact

Organizational impact (i.e., increased safety, fewer errors) is the highest level of evaluation in Kirkpatrick’s (1976) framework. Although this type of evidence is highly valued, very few evaluations are conducted at this level: only 10% (6 studies) of the reviewed studies collected evaluation data at this level (see Table 1). Due to the difficulty of collecting this type of information (in terms of time, resources, identification of a clear criterion, and low occurrences of accidents and mishaps), evidence of CRM training’s impact on the organization as a whole is seldom sought or obtained. Specifically, evidence of this type is difficult to collect because it generally requires some type of longitudinal data, as it takes time for the impact of training to appear at the organizational level. In addition, criterion measures are difficult to identify and it is hard to control the various extraneous variables that may influence (e.g., moderate, mediate) the relationship between CRM training and organizational effectiveness. Of the six studies that collected some form of results measure, two collected information on organizational effectiveness alone, while the others also collected additional types of evaluation evidence (more on this later).

Within the reviewed studies that collected data at the organizational level, most of the information tended to come from one of two sources: (1) anecdotal evidence (e.g., accident reports, incident reports) or (2) longitudinal studies. The predominant type of evidence that has been used to illustrate CRM’s impact on aviation safety is anecdotal reports contained in incident or accident investigations conducted by the National Transportation Safety Board (NTSB). For example, Kayten (1993) cites several examples of reports by the NTSB in which good CRM practices were reported to limit the detrimental effects of either human or mechanical error. Although anecdotal reports contained in accident reports are perhaps the most common and easiest evidence of organizational impact to collect, there are some problems with accident data that argue for other types of organizational effectiveness measures.
Perhaps the most predominant is the rarity with which accidents happen (see Maurino, 1999; Gregorich & Wilhelm, 1993; Helmreich & Foushee, 1993; Salas, Prince, et al., 1999), as such the investigation of incidents (as opposed to accidents) has been suggested. The other source of evidence used within the reviewed studies to assess the impact of CRM training on organizational effectiveness, is based on the longitudinal collection of data. For example, Byrnes and Black (1993) evaluated a CRM program implemented at Delta Airlines and found indications of CRM’s impact on organizational effectiveness—quarterly air carrier discrepancy reports were found to significantly decrease after CRM training was implemented. Although assessing organizational impact, the above study was designed with no control group to act as a comparison against potential intervening environmental confounds across the years referred to within the air discrepancy reports. As such, although the evidence provided is positive, it is not conclusive – more controlled studies need to be conducted. Summary. So, what can we conclude from the reviewed evidence? Unfortunately, not much. Although anecdotal reports indicate that CRM behaviors may contribute to reducing the impact of human and mechanical error within the aviation community, much stronger evidence is needed. And this is easier said than done. There are a number of difficulties for establishing a clear cause and effect between CRM and safety. For example, various factors may intervene between the point at which a CRM program is implemented and the assessment of the program’s impact, making inference somewhat problematic. Clearly, evaluation efforts that systematically track the impact of a particular CRM program on safety provide a stronger argument than anecdotal reports. The bottom line appears to be that although there is evidence of a positive trend regarding

the impact of CRM on safety, more and better evaluations need to be conducted. Moreover, the evaluations that are reported need to be more detailed: several evaluations found in the course of the literature search, while hinting at the impact of CRM on a particular subset of crews, did not provide enough information to reach any real conclusions. Specifically, the information presented was either ambiguous or very general in nature; as such, those efforts were not included in the current review, as it was felt that the results were too speculative.

Multi-Level Evaluation Efforts

Reviewing the 58 identified studies via Kirkpatrick’s typology suggests that when each level of training evaluation criteria is considered independently, there is a fair amount of evidence that CRM training does have a positive impact on participant reaction, learning, and behavior, although the impact on safety cannot yet be determined. However, these conclusions are, in some sense, based upon weak data, for in order to truly assess whether CRM training is effective, evaluation efforts should focus on collecting information from multiple levels (e.g., reaction, learning, behavior, results). The collection of multi-level assessment data is important in that it provides a clearer, more complete picture of training efficacy because the evaluator is not looking at one piece of evidence in isolation. In order for training to be truly effective, it must impact participant learning, learning must transfer to behavior, and behavior must transfer to a difference at the organizational level. The assessment of participant reactions is also an important part of the multi-level assessment, for reactions provide an initial check as to whether training is relevant to the knowledge, skills, and abilities needed on the job, as perceived by participants. In addition, reaction data are an important piece of information in that “liking” the training helps to motivate participants in the learning process.

Of the 58 identified studies, 24 (41%) collected information at multiple levels of Kirkpatrick’s (1976) typology. Of these 24, most collected information at only two levels (13 studies), typically the lower levels (e.g., reaction-learning, reaction-behavior; see Table 1). Although a review of those studies indicated that CRM training generally produced positive results, assessment at only two levels still provides a limited view of the overall effectiveness of the program. A clearer, more complete picture would be provided by collecting information either at more levels or at least at the higher levels (i.e., behavior-results). A fair number of studies evaluated training programs at three of Kirkpatrick’s four levels (10 studies), but only one study was found that evaluated training at all four levels (see Table 1). As the studies that assess three or four levels of Kirkpatrick’s typology provide the clearest picture of the actual effectiveness of CRM training, a brief overview of the value added by these studies follows.

The studies that assessed at least three levels of evidence attest to the fact that the various types of evaluation evidence provide different pieces of information, which may lead to a different overall conclusion as to the effectiveness of CRM training. For example, Smith (1994) found that when participant reactions and learning (attitudes) were assessed, a self-analysis technique used to develop CRM skills in 10 undergraduate flight students was not highly valued and produced little change in attitudes. However, when behavior was assessed, it was found that self-analysis did have an impact in that it helped crews to perform significantly better in three LOFT sessions and moderately
better in three others. Another example of the benefit of conducting multi-level evaluation is a study conducted by Stout et al. (1997). Specifically, these researchers found positive reactions to training, evidence of learning via changed knowledge structures, and evidence of behavioral change in that trained participants performed an average of 8% more desired CRM behaviors than a control group. However, when learning was assessed via attitude change, there was a positive but nonsignificant change. Had either of the above studies collected only single-level evaluation data, it might have come to different overall conclusions as to how to improve the implemented training program.

Alkov and colleagues (see Alkov, 1991; Alkov & Gaynor, 1991) provided the only reviewed study that evaluated CRM training at all four levels of Kirkpatrick’s (1976) typology. Specifically, they evaluated a CRM training program targeted at 45 naval aviation training squadrons (helicopters, attack bombers, and multi-placed fighters). Results suggest that squadrons reported the training as useful and indicated a desire for training to continue. Positive attitude changes regarding CRM training were noted, and squadron commanding officers reported that training contributed to better communication between instructors and students. Finally, some evidence of organizational impact was found in that, following CRM training, the overall aircrew mishap rate declined in all three communities. However, it should be noted that this study is not meant to serve as a model for conducting multi-level evaluations, as it is a preliminary analytical effort and hence many of the conclusions offered are tentative. We do, however, commend their efforts in attempting to evaluate CRM training at all four levels.

Summary. It has been said that in the absence of a well-defined, measurable, “ultimate” criterion (which rarely exists in the real world), it is important to assess training at multiple levels, for each additional source of data serves to increase confidence in the overall evaluation (Cannon-Bowers et al., 1989). For example, although reaction data can indicate whether the trainee felt the program was worthwhile, they have little if any relation to whether the participant learned the material. Similarly, just because a trainee learned the knowledge during training does not guarantee that he/she can translate this knowledge into effective behavior, nor does it guarantee that, if applied, the behavior will have an effect on organizational outcomes. Each source of data provides a limited picture of the results. The studies reviewed above indicate that this advice concerning the assessment of multiple levels is beginning to take hold within the aviation community. However, all the parties involved (aviators, researchers, regulators, the public, the airlines) need to continue to push for assessment at all levels of evaluation.

Although the efforts reviewed within this paper suggest that CRM training programs are generally effective in producing some level of change in participants (e.g., reaction, learning, behavior), the lack of multi-level evaluation efforts makes it difficult to answer whether CRM is truly effective. Specifically, of the 58 reviewed studies, the majority (34) collected data at only one level, whereas the remainder broke down as follows: 13 studies collected data at two levels, 10 studies collected data at three levels, and only 1 study collected data at all four levels, as argued for by Kirkpatrick (1976) and others (Robertson, Taylor, Stelly, & Wagner, 1995; Cannon-Bowers et al., 1989). In order to be truly effective and worth the investment companies and airlines make in CRM programs, positive reactions must
transfer to learning, learning must transfer to behavior, and finally changes in behavior must translate into a reduction in aviation-related mishaps and accidents. We recognize that achieving four levels of evaluation might be impractical, if not impossible, in many situations. But we need to establish stronger links between CRM training and the reduction of accidents. At this point in time, there are not enough multi-level evaluations to assess whether or not this link is there.

DOES CRM TRAINING WORK?

As reported by Salas et al. (1999), the data are encouraging. Although some have previously argued that there is no evidence that CRM is effective (Besco, 1995, 1997, 1998; Simmon, 1997; Komich, 1997), this review concludes that some evidence does exist. And this is important. The picture that has emerged after reviewing the existing evidence within the current framework suggests that CRM training is effective. But as stated earlier, the picture is not as clear as it should be after 20 years. The lack of systematic studies that can clearly show cause and effect, as well as the transfer of learned material to behavior and behavior to results, is a key factor in this unclear picture. Nevertheless, given that CRM training is one of a number of factors that may influence the practice and effectiveness of CRM behaviors, it may be argued that, although imperfect, the current evidence for the effectiveness of CRM training programs is impressive. Specifically, what can be said is that CRM training (generally) produces: (1) positive reactions, (2) enhanced learning, primarily as measured through attitude change, although other learning criteria are also used (e.g., knowledge tests, knowledge structures), and (3) desired behavioral change in the cockpit (simulated or real). However, what cannot be answered with certainty is whether CRM training has an effect on the bottom line, in this case aviation safety. At this point, we believe the tools to determine this are there; what we need are the resources and a mandate to make it happen.

WHERE DO WE GO FROM HERE?

In terms of evaluation, the current review would seem to suggest two areas where future efforts should be concentrated. First, additional evaluation efforts need to begin to assess CRM training at multiple levels, using multiple sources of criteria within each level. Currently, fewer than half (41%) of the published studies used multiple levels to evaluate training. As noted, analyses of these studies suggest that multi-level evaluations provide a much clearer and more diagnostic picture of training efficacy than do single-level evaluations. Furthermore, we found that reaction data were primarily gathered via questionnaires or verbal reports of how well participants liked the course or thought it was worthwhile. Learning criteria were primarily collected through the assessment of attitude change via the CMAQ. Although several other types of learning criteria have been argued for (structural knowledge, knowledge tests), few studies collected more than one type of learning criterion within the same evaluation study. In most of these studies, attitude change was the only criterion collected. Similar arguments can be made with regard to the collection of behavioral data and data examining organizational impact. Utilization of multiple methods to assess the same type of criterion (e.g., learning) increases confidence in the results obtained. The need for longitudinal evaluation efforts that use multiple measures and methodologies has been argued (see Helmreich &
Wilhelm, 1987; Helmreich, Wilhelm, Gregorich, & Chidester, 1990), and such efforts need to be conducted.

The second area that would strengthen the evaluation effort(s) is to ensure that the evaluation and dissemination of results regarding CRM training provide diagnostic information. Although multi-level evaluation is one piece of providing diagnostic information, there are at least three others. First, many of the evaluation efforts reviewed for the current paper have been written up in such a way as to make it difficult to assess the degree to which the tools used to collect evaluation data were theoretically driven or possessed acceptable psychometric properties (e.g., reliability, validity). Related to this need is Helmreich and Wilhelm’s (1987) plea for more guidance for the aviation community in how to implement and evaluate CRM programs. The ambiguity that exists with regard to the content of CRM training, implementation, and evaluation methods, combined with the lack of explicit descriptions of the content of evaluated CRM programs, makes it hard to compare findings across studies or provide general guidance. Finally, the majority of studies that we reviewed provided descriptive data only or generalized anecdotal evidence. For example, in relation to reaction data, several researchers report that participants generally found training useful. This type of information is very subjective in interpretation. Specifically, there is no relational meaning, nor does the reader know how “useful” is defined (e.g., whether it was rated on a scale, whether there were anchor points, or whether responses were totally open). Although descriptive data are obviously better than generalized anecdotal evidence, they still make it hard to determine the “effect” of CRM training, as neither significance nor effect sizes can be determined. Future studies should attempt to report data at higher levels than merely descriptive information so that better conclusions can be drawn.

Exploration and Expansion

As a field in general, CRM training and the corresponding evaluation efforts are moving in two general directions. First, as the aviation community continues to invest money in CRM training, other communities are taking note and beginning to implement similar programs (Salas et al., 2001). The emerging extension of CRM training to other domains further drives the need for multi-level evaluation efforts so that the question of the effectiveness of CRM training can be answered once and for all. A second area that has begun to be investigated (Merritt & Helmreich, 1995b; Helmreich & Merritt, 1998; Maurino, 1999; Chidester et al., 1991; Helmreich, 1997), but must be further examined, is the identification of variables that may moderate or mediate the relationship between CRM training and performance. Understanding how such factors as culture (e.g., national, professional, safety, and organizational), personality, and organizational climate impact the message delivered in CRM training will help in the design, delivery, and evaluation phases of CRM. This in turn would allow the aviation community, as well as other areas of industry, to get the “most bang for their buck”.

CONCLUDING REMARKS

After reviewing the 58 identified evaluations of CRM training programs within the aviation community, the following can be said. First, CRM training programs seem to produce positive participant reactions, learning, and application of learned behavior via
simulators or on-line/on the job. However, the final word on whether CRM has an impact on safety remains to be seen. Second, although the aviation community should be commended in that multi-level evaluations are becoming more common, evaluation needs to become a systematically accepted cost of business. Third, the review raised a few methodological concerns in that, many times, descriptions of the components of CRM training or the methodology used to develop and test evaluation measures were not very clear within the literature, making it hard to determine the reliability, validity, and transferability of reported results. At this point in time it is unclear whether this is a concern with pure methodology or a combination of methodology and the procedures used to disseminate findings. Finally, the current review illustrated that although there are still some rough spots in terms of evaluating implemented CRM training programs, the picture is not as bleak as some opponents would make it out to be: trends seem to indicate that CRM training does have an impact on multiple aspects of the individuals and crews completing the program. However, more and better evaluations are needed, and the aviation community should demand them. We believe that time and continued systematic evaluations will reveal the long-term impact of CRM training: improved safety in the skies.

ACKNOWLEDGEMENTS

We would like to thank Eleana Edens, Deborah A. Boehm-Davis, Janis A. Cannon-Bowers, and two anonymous reviewers for their thoughtful comments on earlier drafts. This work was performed under the auspices of the UCF/FAA/NAWCTSD Partnership for Aviation Team Training.

REFERENCES

Alkov, R. A. (1991). U. S. Navy aircrew coordination training - a progress report. In R. S. Jensen (Ed.), Proceedings of the 6th International Symposium on Aviation Psychology (pp. 368-371). OH: The Ohio State University.

Alkov, R. A., & Gaynor, J. A. (1991). Attitude changes in Navy/Marine flight instructors following an aircrew coordination training course. The International Journal of Aviation Psychology, 1(3), 245-253.

Alliger, G. M., Tannenbaum, S. I., Bennett, W., Jr., & Traver, H. (1997). A meta-analysis of the relations among training criteria. Personnel Psychology, 50(2), 341-358.

Alliger, G. M., & Janak, E. A. (1989). Kirkpatrick’s levels of training criteria: Thirty years later. Personnel Psychology, 42, 331-342.

Arnold, R. L., & Jackson, D. L. (1985). Recurrent cockpit resource management training at United Airlines. In R. S. Jensen & J. Adrion (Eds.), Proceedings of the 3rd Symposium on Aviation Psychology (pp. 345-351). OH: The Ohio State University.

Baker, D. P., Bauman, M., & Zalesny, M. D. (1991). Development of aircrew coordination exercises to facilitate transfer. In R. S. Jensen (Ed.), Proceedings of the 6th International Symposium on Aviation Psychology (pp. 314-319). OH: The Ohio State University.

Baker, D., Prince, C., Shrestha, L., Oser, R., & Salas, E. (1993). Aviation computer games for crew resource management training. The International Journal of Aviation Psychology, 3(2), 143-156.

Barker, J. M., Clothier, C., Woody, J. R., McKinney, E. H., & Brown, J. L. (1996, January). Crew resource management: A simulator study comparing fixed versus formed aircrews. Aviation, Space, and Environmental Medicine, 67(1), 3-7.

Besco, R. O. (1998). Crew resource management training: What to teach and how to teach it! Unpublished manuscript.

Besco, R. O. (1997). The need for operational validation of human relations-centered CRM training assumptions. In R. S. Jensen & L. A.
Rakovan (Eds.), Proceedings of the 9th International Symposium on Aviation Psychology (pp. 536-540). OH: The Ohio State University.

Besco, R. O. (1995). The potential contributions and scientific responsibilities of aviation psychologists. In N. Johnston, R. Fuller, & N. McDonald (Eds.), Aviation psychology: Training and selection. Proceedings of the 21st Conference of the European Association for Aviation Psychology, Volume 2, 141-148. England: Avebury Aviation.

Bowers, C. A., Salas, E., Prince, C., & Brannick, M. (1992). Games teams play: A method for investigating team coordination and performance. Behavior Research Methods, Instruments, and Computers, 24, 503-506.

Brannick, M. T., Prince, A., Prince, C., & Salas, E. (1995). The measurement of team process. Human Factors, 37(3), 641-651.

Brannick, M. T., Salas, E., & Prince, C. (Eds.) (1997). Team performance assessment and measurement: Theory, methods, and applications. Mahwah, NJ: LEA.

Butler, R. E. (1993). LOFT: Full mission simulation as crew resource management training. In E. L. Wiener, B. G. Kanki, & R. L. Helmreich (Eds.), Cockpit resource management (pp. 231-259). CA: Academic Press.

Butler, R. E. (1991). Lessons from cross-fleet/cross-airline observations: Evaluating the impact of CRM/LOFT training. In R. S. Jensen (Ed.), Proceedings of the 6th International Symposium on Aviation Psychology (pp. 326-331). OH: The Ohio State University.

Byrnes, R. E., & Black, R. (1993). Developing and implementing CRM programs: The Delta experience. In E. L. Wiener, B. G. Kanki, & R. L. Helmreich (Eds.), Cockpit resource management (pp. 421-443). CA: Academic Press.

Cannon-Bowers, J. A., Prince, C., Salas, E., Owens, J., Morgan, B., Jr., & Gonos, G. (1989). Determining aircrew coordination training effectiveness. Paper presented at the 11th Interservice/Industry Training Systems Conference, Fort Worth, TX.

Chidester, T. R., Helmreich, R. L., Gregorich, S. E., & Geis, C. E. (1991). Pilot personality and crew coordination: Implications for training and selection. The International Journal of Aviation Psychology, 1(1), 25-44.

Chute, R. D., & Wiener, E. L. (1995). Cockpit-cabin communication: I. A tale of two cultures. The International Journal of Aviation Psychology, 5(3), 258-276.

Clark, R. E., Nielsen, R. A., & Wood, R. L. (1991). The interactive effects of cockpit resource management, domestic stress, and information processing in commercial aviation. In R. S. Jensen (Ed.), Proceedings of the 6th International Symposium on Aviation Psychology (pp. 776-781). OH: The Ohio State University.

Clothier, C. C. (1991). Behavioral interactions across various aircraft types: Results of systematic observations of line operations and simulations. In R. S. Jensen (Ed.), Proceedings of the 6th International Symposium on Aviation Psychology (pp. 332-337). OH: The Ohio State University.

Connolly, T. J., & Blackwell, B. B. (1987). A simulator approach to training in aeronautical decision making. In R. S. Jensen (Ed.), Proceedings of the 4th International Symposium on Aviation Psychology (pp. 251-258). OH: The Ohio State University.

Diehl, A. (1991). The effectiveness of training programs for preventing aircrew “error”. In R. S. Jensen (Ed.), Proceedings of the 6th International Symposium on Aviation Psychology (pp. 640-655). OH: The Ohio State University.

Flin, R. (1995). Crew resource management for teams in the offshore oil industry. Journal of European Industrial Training, 19(9), 23-27.

Fonne, V. M., & Fredriksen, O. K., Capt (1995). Resource management and crew training for HSV-navigators. In R. S. Jensen & L. A. Rakovan (Eds.), Proceedings of the 8th International Symposium on Aviation Psychology (pp. 585-590). OH: The Ohio State University.

Fowlkes, J. E., Lane, N. E., Salas, E., Franz, T., & Oser, R. (1994). Improving the measurement of team performance: The TARGETs methodology. Military Psychology, 6, 47-61.

Fowlkes, J. E., Lane, N. E., Salas, E., Oser, R. L., & Prince, C. (1992). TARGETs for aircrew coordination training. Proceedings of the 14th Interservice/Industry Training Systems and Education Conference (pp. 344-352).

Freeman, C., & Simmon, D. A. (1991). Taxonomy of crew resource management: Information processing domain. In R. S. Jensen (Ed.), Proceedings of 6th Annual International Symposium on Aviation Psychology (pp. 391-397). OH: The Ohio State University.

Geis, C. E. (1987). Changing attitudes through training: A formal evaluation of training effectiveness. In R. S. Jensen (Ed.), Proceedings of the 4th International Symposium on Aviation Psychology (pp. 392-398). OH: The Ohio State University.

Goldsmith, T., & Kraiger, K. (1997). Structural knowledge assessment and training evaluation. In J. Ford, S. Kozlowski, K. Kraiger, E. Salas, & M. Teachout (Eds.), Improving training effectiveness in work organizations (pp. 19-46). New Jersey: Lawrence Erlbaum.

Goldstein, I. L. (1993). Training in organizations: Needs assessment, development, and evaluation (3rd ed.). Monterey, CA: Brooks/Cole Publishing Company.

Grau, J. Y., & Valot, C. (1997). Evolvement of crew attitudes in military airlift operations after CRM course. In R. S. Jensen & L. A. Rakovan (Eds.), Proceedings of the 9th International Symposium on Aviation Psychology (pp. 556-561). OH: The Ohio State University.

Gregorich, S. E. (1993). The dynamics of CRM attitude change: Attitude stability. In Proceedings of the 7th International Symposium on Aviation Psychology (pp. 509-512). OH: The Ohio State University.

Gregorich, S. E., Helmreich, R. L., & Wilhelm, J. A. (1990). Structure of cockpit management attitudes. Journal of Applied Psychology, 75(6), 682-690.

Gregorich, S. E., & Wilhelm, J. A. (1993). Crew resource management training assessment. In E. L. Wiener, B. G. Kanki, & R. L. Helmreich (Eds.), Cockpit resource management (pp. 173-198). CA: Academic Press.

Grubb, G., Morey, J. C., & Simon, R. (1999). Applications of the theory of reasoned action model of attitude assessment in the air force CRM program. In R. S. Jensen, B. Cox, J. D. Callister, & R. Lavis (Eds.), Proceedings of the 10th International Symposium on Aviation Psychology (pp. 298-301). OH: The Ohio State University.

Halliday, J. T., Maj., Biegalski, C. S., Lt Col., & Inzana, A., Maj. (1987). CRM training in the 349th military airlift wing. In H. W. Orlady & H. C. Foushee (Eds.), Proceedings of the NASA/MAC workshop on cockpit resource management (NASA Conference Publication No. 2455), pp. 148-158. Moffett Field, CA: NASA-Ames Research Center.

Hansberger, J. T., Holt, R. W., & Boehm-Davis, D. (1999). Instructor/evaluator evaluations of ACRM effectiveness. In R. S. Jensen, B. Cox, J. D. Callister, & R. Lavis (Eds.), Proceedings of the 10th International Symposium on Aviation Psychology (pp. 279-284). OH: The Ohio State University.

Hayward, B., & Alston, N. (1991). Team building following a pilot labour dispute: Extending the CRM envelope. In R. S. Jensen (Ed.), Proceedings of the 6th International Symposium on Aviation Psychology (pp. 377-383). OH: The Ohio State University.

Helmreich, R. L. (1991). Strategies for the study of flightcrew behavior. In R. S. Jensen (Ed.), Proceedings of the 6th International Symposium on Aviation Psychology (pp. 338-343). OH: The Ohio State University.

Helmreich, R. L. (1984). Cockpit management attitudes. Human Factors, 26, 583-589.

Helmreich, R. L., Chidester, T. R., Foushee, H. C., Gregorich, S., & Wilhelm, J. A. (1990, May). How effective is cockpit resource management training? Exploring issues in evaluating the impact of programs to enhance crew coordination. Flight Safety Digest, 1-17.

Helmreich, R. L., & Foushee, H. C. (1993). Why crew resource management? Empirical and theoretical bases of human factors in aviation. In E. L. Wiener, B. G. Kanki, & R. L. Helmreich (Eds.), Cockpit resource management (pp. 3-45). CA: Academic Press.

Helmreich, R. L., & Merritt, A. C. (1998). Culture at work in aviation and medicine: National, organizational, and professional influences. Aldershot: Ashgate.

Helmreich, R. L., Merritt, A. C., & Wilhelm, J. A. (1999). The evolution of crew resource management training in commercial aviation. The International Journal of Aviation Psychology, 9(1), 19-32.

Helmreich, R. L., & Wilhelm, J. A. (1991). Outcomes of crew resource management training. The International Journal of Aviation Psychology, 1(4), 287-300.

Helmreich, R. L., & Wilhelm, J. A. (1987). Evaluating cockpit resource management training. In R. S. Jensen (Ed.), Proceedings of the 4th International Symposium on Aviation Psychology (pp. 440-446). OH: The Ohio State University.

Helmreich, R. L., Wilhelm, J. A., Gregorich, S. E., & Chidester, T. R. (1990). Preliminary results from the evaluation of cockpit resource management training: Performance ratings of flightcrews. Aviation, Space, and Environmental Medicine, 61(6), 586-589.

Helmreich, R. L., Wilhelm, J. A., Klinect, J. R., & Merritt, A. C. (in press). Culture, error and Crew Resource Management. In E. Salas, C. A. Bowers, & E. Edens (Eds.), Applying resource management in organizations: A guide for professionals. Hillsdale, NJ: Erlbaum.

Hofstede, G. (1988). McGregor in southeast Asia. In D. Sinha & H. S. R. Kao (Eds.), Social values and development: Asian perspectives (pp. 304-314). Thousand Oaks, CA: Sage Publications.

Holt, R. W., Boehm-Davis, D. A., & Hansberger, J. T. (1999). Evaluating the effectiveness of ACRM using LOE and line-check data. In R. S. Jensen, B. Cox, J. D. Callister, & R. Lavis (Eds.), Proceedings of the 10th International Symposium on Aviation Psychology (pp. 273-278). OH: The Ohio State University.

Hormann, H. J., Goeters, K. M., Maschke, P., & Schiewe, A. (1995). Implementation and initial evaluation of DLR/LH CRM-training. In R. S. Jensen & L. A. Rakovan (Eds.), Proceedings of the 8th International Symposium on Aviation Psychology (pp. 591-596). OH: The Ohio State University.

Howard, S., Gaba, D., Fish, K., Yang, G., & Sarnquist, F. (1992). Anesthesia crisis resource management training: Teaching anesthesiologists to handle critical incidents. Aviation, Space, and Environmental Medicine, 63, 763-770.

Ikomi, P. A., Boehm-Davis, D. A., Holt, R. W., & Incalcaterra, K. A. (1999). Jump seat observations of advanced crew resource management (ACRM) effectiveness. In R. S. Jensen, B. Cox, J. D. Callister, & R. Lavis (Eds.), Proceedings of the 10th International Symposium on Aviation Psychology (pp. 292-297). OH: The Ohio State University.

Incalcaterra, K. A., & Holt, R. W. (1999). Pilot evaluations of ACRM programs. In R. S. Jensen, B. Cox, J. D. Callister, & R. Lavis (Eds.), Proceedings of the 10th International Symposium on Aviation Psychology (pp. 285-291). OH: The Ohio State University.

Irwin, C. M. (1991). The impact of initial and recurrent cockpit resource management training on attitudes. In R. S. Jensen (Ed.), Proceedings of the 6th International Symposium on Aviation Psychology (pp. 344-349). OH: The Ohio State University.

Jackson, D. L. (1983). United Airlines’ cockpit resource management training. In R. S. Jensen (Ed.), Proceedings of the 2nd Symposium on Aviation Psychology (pp. 131-137). OH: The Ohio State University.

Jentsch, F., & Bowers, C. A. (1998). Evidence for the validity of PC-based simulations in studying aircrew coordination. The International Journal of Aviation Psychology, 8(3), 243-260.

Jentsch, F., Bowers, C. A., & Holmes (1995). The acquisition and decay of aircrew coordination skills. In R. S. Jensen & L. A. Rakovan (Eds.), Proceedings of the 8th International Symposium on Aviation Psychology (pp. 1063-1068). OH: The Ohio State University.

Johnston, J. H., Smith-Jentsch, K. A., & Cannon-Bowers, J. A. (1997). Performance measurement tools for enhancing team decision-making training. In M. T. Brannick, E. Salas, & C. Prince (Eds.), Team performance assessment and measurement: Theory, methods, and applications (pp. 311-327). NJ: Lawrence Erlbaum.

Kayten, P. J. (1993). The accident investigator’s perspective. In E. L. Wiener, B. G. Kanki, & R. L. Helmreich (Eds.), Cockpit resource management (pp. 283-314). CA: Academic Press.

Kirkpatrick, D. L. (1976). Evaluation of training. In R. L. Craig (Ed.), Training and development handbook: A guide to human resources development (pp. 18.1-18.27). New York, NY: McGraw-Hill.

Komich, J. (1997). CRM training: Which crossroads to take now? In R. S. Jensen & L. A. Rakovan (Eds.), Proceedings of the 9th International Symposium on Aviation Psychology (pp. 541-546). OH: The Ohio State University.

Kraiger, K., Ford, J. K., & Salas, E. (1993). Application of cognitive, skill-based, and affective theories of learning outcomes to new methods of training evaluation. Journal of Applied Psychology, 78(2), 311-328.

Kraiger, K., Salas, E., & Cannon-Bowers, J. A. (1995). Measuring knowledge organization as a method for assessing learning during training. Human Factors, 37, 804-816.

Lassiter, D. L., Vaughn, J. S., Smaltz, V. E., Morgan, B. B., Jr., & Salas, E. (1990). A comparison of two types of training interventions on team communication performance. Proceedings of the Human Factors Society 34th Annual Meeting, 2, 1372-1376.

Leedom, D. K., & Simon, R. (1995). Improving team coordination: A case for behavioral-based training. Military Psychology, 7, 109-122.

Margerison, C., Davies, R., & McCann, D. (1987). High-flying management development. Training and Development Journal, 41(2), 38-41.
Maschke, P., Goeters, K. M., Hormann, H. J., & Schiewe, A. (1995). The development of the DLR/Lufthansa crew resource management training program. In N. Johnston, R. Fuller, & N. McDonald (Eds.), Aviation psychology: Training and selection. Proceedings of the 21st Conference of the European Association for Aviation Psychology, 2, 23-31. England: Avebury Aviation.
Maurino, D. E. (1999). Safety prejudices, training practices, and CRM: A mid-point perspective. International Journal of Aviation Psychology, 9, 413-427.
Maurino, D. E. (1994). Cross-cultural perspectives in human factors training: Lessons from the ICAO Human Factors Program. The International Journal of Aviation Psychology, 4, 173-181.
Merritt, A. C., & Helmreich, R. L. (1995a). CRM in 1995: Where to from here? In B. J. Hayward & A. R. Lowe (Eds.), Applied aviation psychology: Achievement, change, and challenge. Proceedings of the Third Australian Aviation Psychology Symposium (pp. 111-126). Aldershot: Avebury Aviation.
Merritt, A. C., & Helmreich, R. L. (1995b). Culture in the cockpit: A multi-airline study of pilot attitudes and values. Proceedings of the 8th International Symposium on Aviation Psychology (pp. 676-681). OH: The Ohio State University.
Morey, J. C., Grubb, G., & Simon, R. (1997). Towards a new measurement approach for cockpit resource management attitudes. In R. S. Jensen & L. A. Rakovan (Eds.), Proceedings of the 9th International Symposium on Aviation Psychology (pp. 478-483). OH: The Ohio State University.
Mudge, R. W. (1983). Cockpit management training for the professional pilot. In R. S. Jensen (Ed.), Proceedings of the 2nd Symposium on Aviation Psychology (pp. 165-172). OH: The Ohio State University.
Naef, W., Cpt. (1995). Practical application of CRM concepts: Swissair’s human aspects development program (HAD). In R. S. Jensen & L. A. Rakovan (Eds.), Proceedings of the 8th International Symposium on Aviation Psychology (pp. 597-602). OH: The Ohio State University.
Nullmeyer, R. T., & Spiker, V. A. (under review). The importance of crew resource management in MC-130P mission performance: Implications for training effectiveness evaluation. Military Psychology.
Orlady, H. W., & Foushee, H. C. (1987). Cockpit resource management training (Technical Report Number NASA CP-2455). Moffett Field, CA: NASA Ames Research Center.
Predmore, S. C. (1991). Microcoding communication in accident investigation: Crew coordination in the United 811 and United 232. In R. S. Jensen (Ed.), Proceedings of the 6th International Symposium on Aviation Psychology (pp. 350-355). OH: The Ohio State University.
Prince, C., Brannick, M., Prince, C., & Salas, E. (1997). The measurement of team process behaviors in the cockpit: Lessons learned. In M. T. Brannick, E. Salas, & C. Prince (Eds.), Team performance assessment and measurement: Theory, methods, and applications (pp. 289-310). Mahwah, NJ: LEA.

Prince, C., Oser, R., Salas, E., & Woodruff, W. (1993). Increasing hits and reducing misses in CRM/LOS scenarios: Guidelines for simulator scenario development. The International Journal of Aviation Psychology, 3(1), 69-82.
Prince, C., & Salas, E. (1999). Team processes and their training in aviation. In D. Garland, J. Wise, & D. Hopkins (Eds.), Handbook of aviation human factors (pp. 193-213). Mahwah, NJ: LEA.
Robertson, M. M., & Taylor, J. C. (1995). Team training in aviation maintenance settings: A systematic evaluation. In B. J. Hayward & A. R. Lowe (Eds.), Applied aviation psychology: Achievement, change, and challenge. Proceedings of the Third Australian Aviation Psychology Symposium (pp. 373-383). Ashgate: Avebury Aviation.
Robertson, M. M., Taylor, J. C., Stelly, J. W., & Wagner, R. (1995). A systematic training evaluation model applied to measure the effectiveness of an aviation maintenance team training program. In R. S. Jensen & L. A. Rakovan (Eds.), Proceedings of the 8th International Symposium on Aviation Psychology (pp. 631-636). OH: The Ohio State University.
Rollins, M. L. (1995). A descriptive study of crew resource management attitude change. In N. Johnston, R. Fuller, & N. McDonald (Eds.), Aviation psychology: Training and selection. Proceedings of the 21st Conference of the European Association for Aviation Psychology, 2, 45-50. England: Avebury Aviation.
Salas, E., Bowers, C. A., & Edens, E. (Eds.) Improving teamwork in organizations: Applications of resource management training. Hillsdale, NJ: LEA, Inc.
Salas, E., & Cannon-Bowers, J. A. (2001). The science of training: A decade of progress. Annual Review of Psychology, 52, 471-499.
Salas, E., Fowlkes, J. E., Stout, R. J., Milanovich, D. M., & Prince, C. (1999). Does CRM training improve teamwork skills in the cockpit? Two evaluation studies. Human Factors, 41(2), 326-343.
Salas, E., Prince, C., Bowers, C., Stout, R., Oser, R. L., & Cannon-Bowers, J. A. (1999). A methodology for enhancing crew resource management training. Human Factors, 41(1), 161-172.
Salas, E., Rhodenizer, L., & Bowers, C. A. (2000). The design and delivery of CRM training: Exploiting available resources. Human Factors, 42(3), 490-511.
Schiewe, A. (1995). On the acceptance of CRM-methods by pilots: Results of a cluster analysis. In R. S. Jensen & L. A. Rakovan (Eds.), Proceedings of the 8th International Symposium on Aviation Psychology (pp. 540-545). OH: The Ohio State University.
Silverman, D. R., Spiker, V. A., Tourville, S. J., & Nullmeyer, R. T. (1997). Team coordination and performance during combat mission training. Paper presented at the Interservice/Industry Training, Simulation, and Education Conference, Orlando, FL.
Simmon, D. A., Capt (Ret.) (1997). How to fix CRM. In R. S. Jensen & L. A. Rakovan (Eds.), Proceedings of the 9th International Symposium on Aviation Psychology (pp. 550-553). OH: The Ohio State University.
Simpson, P., & Wiggins, M. (1995). Human factor attitudes. In B. J. Hayward & A. R. Lowe (Eds.), Applied aviation psychology: Achievement, change, and challenge. Proceedings of the Third Australian Aviation Psychology Symposium (pp. 185-192). Ashgate: Avebury Aviation.
Smith, G. M. (1994). Active learning strategies in undergraduate CRM flight training. In N. Johnston, R. Fuller, & N. McDonald (Eds.), Aviation psychology: Training and selection. Proceedings of the 21st Conference of the European Association for Aviation Psychology (pp. 17-22). England: Avebury Aviation.
Spiker, V. A., Nullmeyer, R. T., Tourville, S. J., & Silverman, D. R. (1998, July). Combat mission training research at the 58th special operations wing: A summary (iii-52). In USAF AMRL Technical Report (Brooks), July 1998, AL-HR-TR-1997-0182.
Spiker, V. A., Silverman, D. R., Tourville, S. J., & Nullmeyer, R. T. (1998). Tactical resource management effects on combat mission training performance (iii-90). In USAF Technical Report (Brooks), July 1998, AL-HR-TR-1997-0137.
Stout, R. J., Salas, E., & Fowlkes, J. E. (1997). Enhancing teamwork in complex environments through team training. Group Dynamics: Theory, Research, and Practice, 1(2), 169-182.
Stout, R. J., Salas, E., & Fowlkes, J. E. (1996). The efficacy of enhancing team performance in complex environments. In E. Salas & R. J. Stout (Co-Chairs), The science and practice of enhancing teamwork in organizations. Symposium conducted at the 104th annual meeting of the American Psychological Association, Toronto, Canada.
Stout, R. J., Salas, E., & Kraiger, K. (1997). The role of trainee knowledge structures in aviation team environments. The International Journal of Aviation Psychology, 7, 235-250.
Taggart, W. R. (1994). Crew resource management: Achieving enhanced flight. In N. Johnston, N. McDonald, & R. Fuller (Eds.), Aviation psychology in practice. England: Avebury.
United States General Accounting Office (1997). Human Factors: FAA’s guidance and oversight of pilot crew resource management training can be improved (GAO/RCED-98-7). Washington, DC: GAO Report to Congressional Requesters.
Vandermark, M. J. (1991). Should flight attendants be included in CRM training? A discussion of a major air carrier’s approach to total crew training. The International Journal of Aviation Psychology, 1(1), 87-94.
Wiener, E. L., Kanki, B. G., & Helmreich, R. L. (Eds.) (1993). Cockpit resource management. CA: Academic Press.
Wilhelm, J. (1991). Crew member and instructor evaluations of line orientated flight training. In R. S. Jensen (Ed.), Proceedings of the 6th International Symposium on Aviation Psychology (pp. 362-367). OH: The Ohio State University.
Yamamori, H., & Mito, T. (1993). Keeping CRM is keeping the flight safe. In E. L. Wiener, B. G. Kanki, & R. L. Helmreich (Eds.), Cockpit resource management (pp. 399-420). CA: Academic Press.
Young, J. P. (1995). Using group dynamics to reinforce CRM concepts in a collegiate course. In R. S. Jensen & L. A. Rakovan (Eds.), Proceedings of the 8th International Symposium on Aviation Psychology (pp. 1189-1191). OH: The Ohio State University.

Table 1. Summary of CRM Evaluative Efforts

Source

Community

Baker, Bauman, & Zalesny (1991)

41 CH-46 pilots

CRM training content

Type of Study/ Data collection

Reactions

Quasi-experimental

- Review of means indicated that Pre-Flight Brief exercise was a worthwhile addition to ACT course and likely to have an impact on next briefing experience (approx 4.4 on 5 pt scale) - Able to cite specific ways they planned to use information gained in both exercises - Open ended questionnaire revealed favorable impressions of both exercises - 75% ranked role play assertiveness exercise as 1st or 2nd choice (n=4) - 90% of aviators agreed that tabletop system could be used for CRM skills training - Most felt good way of learning - Most agreed system demonstrated importance of CRM

Reaction Studies - Pre-Flight Brief - Assertiveness

Self-report survey

Baker, Prince, Shrestha, Oser, & Salas (1993)

Chute & Wiener (1995)

112 male military aviators

Survey conducted at 2 airlines

- No specification of skills taught - Acceptability of tabletop training system to augment CRM training

Quasi-experimental

- No specification of skills taught

Quasi-experimental Self-report survey

1 had joint CRM program for 1 yr (cockpit/cabin crew)

Clark, Nielsen, & Wood (1991)

1 CRM for pilots 135 commercial airline pilots

Self-report survey

- No specification of exact skills taught (mention

Quasi-

- Good program that should be reinstated (joint) - Good reactions, suggest extend to cabin crew (other)

- Felt CRM enhanced information processing

Learning

Findings Behavior

Results/ Organizational Impact

Maschke, Goeters, Hormann, & Schiewe (1995)

Cockpit crews; DLR/Lufthansa

Hormann, Goeters, Maschke, & Schiewe (1995)

750 participants in Lufthansa cockpit crews

importance of stress management, communication, interpersonal skills)

experimental

- Judgement/ decision making - Communication - Leadership/teamwork

Quasi-experimental

- Judgment/ decision making - Communication - Leadership/teamwork

Self-report survey

Self-report survey

Quasi-experimental

- Preliminary data indicate that 90% rated the course content as highly relevant - Preliminary data indicate that 84% thought the method of presentation was attractive - Units that were based on case studies or those that used role play in job-related scenarios had very positive reactions - Methods based mostly on lecture were not rated favorably - Favorable reactions (Likert scale)

Self-report survey Quasi-experimental

- Positive reaction – class worthwhile

Quasi-experimental Self-report survey

Schiewe (1995)

724 cockpit members

- Comm. - Judgement/DM - Teamwork

Quasi-experimental Self-report survey

Vandermark (1991)

Young (1995)

America West Flight attendants and cockpit crew (n=appx. 1200)

- Those suggested by Hackman (1989) – but no actual specification

42 Junior year flight students at Purdue University

- Communication skills - Stress management - Leadership - Psychological factors - Team building - Crew coordination - Conflict resolutions - Situational awareness - Decision making/ problem solving

ability - Majority indicated that negative effects of aviation stress outweigh positive effects of CRM - 80% thought course was useful or extremely useful (as per Likert scale)

Self-report survey

Learning Studies Alkov & Gaynor (1991)

Chidester, Helmreich, Gregorich, & Geis (1991)

58 CRM training instructors

528 aviators from USAF military airlift command

-Two-week instructor training course

Quasi-experimental

- No specification of CRM skills taught - No specification of CRM skills taught

Self-report survey/CMAQ Quasi-experimental Self-report

- Positive shifts in attitudes were detected as a result of instructor training course

- Training produced both positive and negative attitude change via type of personality

Gregorich (1993)

1191 participants – major air carrier

- No specification of skills

survey/CMAQ Quasi-experimental

- Initial training produced significant positive attitude change - Initial increase followed by significant reductions in attitude levels between training cycles

Self-report survey

Gregorich, Helmreich, & Wilhelm (1990)

National air carrier (696 participants)

- No specification of skills

Quasi-experimental

Grubb, Morey, & Simon (1999)

Helmreich, Merritt, & Wilhelm (1999)

Irwin (1991)

Air Force flight crews (n=2095) and mission crews (n=564)

- Air Force CRM behaviors (group dynamics, stress awareness, mission planning, risk mgmt behaviors, workload mgmt, comm., situation awareness, human performance behaviors)

Pilots in several organizations surveyed just after completion of CRM course and several years later Major U.S. air carrier

- No specification of skills taught

Self-report survey/Revised CMAQ Quasi-experimental Self-report survey

Quasi-experimental

188 fighter, 198 transport, and 77 bomber Air Force pilots undergoing CRM training

- Positive attitude change (CMAQ) - Reduction in response variation for 2 scales of CMAQ after training - Results were combined for all platforms – indicate significant positive attitude change as per all 8 behaviors

- Decay in attitudes not immediately apparent, but some decay over time

Self-report survey - No specification of exact content

Quasi-experimental Self-report survey

Morey, Grubb, & Simon (1997)

Air Force CRM training: 8 core CRM behaviors

Quasi-experimental Self-report survey

- Significant positive attitude change (CMAQ) - Few examples of boomerang - Found attitudes decline over time - Recurrent training results in positive attitude change - Pre-training attitudes varied by pilot group (transport vs. fighter, bomber) - For all 3 groups pilot attitudes toward CRM significantly improved with training

Rollins (1995)

Simpson & Wiggins (1995)

3 commercial airlines (2 US, 1 Canadian); 1 US military (n=508)

- No description of specific CRM skills trained - Training lasted 1-2 days

Approximately 88 general aviation pilots

- No specification of skills taught

Quasi-experimental Self-report survey/CMAQ Quasi-experimental Self-report survey

Yamamori & Mito (1993)

2300 crew members of Japan airline

- Recognize and understand different interpersonal styles and effects on crew interaction in cockpit (inquiry, advocacy, conflict, problem def, critique, problem solving)

Quasi-experimental Self-report survey

- Improvement rate varied across groups, with the most positive post-training attitudes shown by transport pilots, followed by bomber and fighter pilots - Measurable improvement in positive CRM attitude (CMAQ)

- Pilots who had previously completed a human factors course differed significantly in their attitudes from those who had not - Pilots who had taken a human factors course were more confident in their ability to cope with emergency situations and exhibited a relatively heightened level of self-awareness - Strengthens and crystallizes attitudes toward more effective CRM in cockpit

Behavior Studies Arnold & Jackson (1985)

96 flight crews at United Airlines

Barker, Clothier, Woody, McKinney, & Brown (1996)

17 crews of active duty USAF

Brannick, Prince, Prince, & Salas (1995)

51 military air crews (Navy)

- Recurrent training (topic changes each year) - Decision making - No specification of CRM skills previously taught - Compared fixed and formed crews

Quasi-experimental

- No specification of skills trained - Skills measured (assertiveness, decision making/mission analysis, adaptability,

Quasi-experimental

Rating Quasi-experimental Observation

Observation

- No statistical analysis presented - On a 6 pt scale that evaluates overall CRM performance in LOFT most fall between 3-6 (most 4-5) - No significant differences in CRM behaviors between 2 crews - Average overall rating using LOS checklist was 3.35 (fixed) and 3.38 (formed) - Fixed crews committed more minor errors than formed crews (4.4 vs 2.6), while there were no significant differences in the number of major errors committed via crew formation - On average participants were rated as slightly above average in performing behaviors related to CRM (mean=3.56, 3.67 Scenario A and B)

situation awareness, leadership, comm.) Clothier (1991)

Post-hoc

Major domestic airline (3,000 crews; 2000 untrained, 1000 trained)

- Significant improvement seen in LOFT exercises (485 trained outperformed 1625 untrained crews) - Crews flying on the line exhibited improved performance (significant improvement in 12 of 14 areas) - Crew interactions earned higher scores (LOS) as they progressed through initial training to recurrent training

Also look at data across 5 airlines – same results

Connolly & Blackwell (1987)

Aeronautical Science students at Embry Riddle University – (16 exp., 13 control)

- Risk assessment - Decision making - Hazardous thought patterns

Experimental

Helmreich, Wilhelm, Gregorich, & Chidester (1990)

Major airline

- No specification of skills taught

Helmreich & Foushee (1993); Taggart (1994)

Over 2000 line flights & LOFT sessions For 859 crews LOFT involved in flight emer. Major airline

Observation

Quasi-experimental Observational ratings

- No specification of skills taught

Quasi-experimental Observations

- Checklist scores and flight ratings indicated exp. group performed significantly better on post-test than control - Compared with control group, exp. group showed a significantly greater amount of change on both checklist scores and flight ratings after training

- Global rating of overall crew performance using LINE/LOFT worksheet – behavior both on the line and in LOFT indicates positive changes in CRM performance - Wide variation in specific CRM behaviors used between fleets at same airline - Significant positive shifts in process behavior during line operations across a 3-year period.

19 I/E evaluating 2 above fleets; considering all pilots that these I/Es have evaluated over past 6 months

- Communication - Situation assessment - Planning/Decision Making

Quasi-experimental

Jentsch, Bowers, & Holmes (1995)

20 instrument rated pilots from Embry Riddle

- Mission analysis - Situation awareness

Experimental

Nullmeyer & Spiker (under review); Spiker, Nullmeyer, Tourville, & Silverman (1998); Spiker, Silverman, Tourville, & Nullmeyer (1998); Silverman, Spiker, Tourville, & Nullmeyer (1997)

11 Air Force SOC MC-130P aircrews

- Mission planning/debrief - Task management - Situational awareness - Crew coordination - Communication - Risk management - Tactics employment

Nullmeyer & Spiker (under review)

87 students in MC-130P

Hansberger, Holt, & Boehm-Davis (1999)

Instructor Ratings after the fact

Observation

- Mission planning/debrief - Task management - Situation awareness - Crew coordination - Communication - Risk management - Tactics employment

Quasi-experimental Instructor ratings, Self-report survey, Observation

Quasi-experimental Instructor comments translated into ratings

- ACRM pilots assessed higher than non ACRM pilots on workload management, comm., and planning - Expected questions from behavioral observation form to factor into workload, comm., and situation awareness; situation awareness factored into planning instead - I/E has positive evaluations for ACRM training especially in area of communication - ACRM strong in facilitating establishment of bottom lines and back up plans, not so strong in reducing distractions or helping crew to be generally organized and composed - Communication ratio decreased for groups receiving training - Crews tended to maintain efficient comm. patterns through all 7 flights - Comm. frequencies did not change significantly after 45 days - Assessed performance differences, none - Team coordination behaviors positively related to mission performance - Crew coordination processes were differentially related to performance across missions - Quality of mission planning related to mission performance - Self report of mission performance and crew coordination positively related

- Based on rating system devised from instructor comments in grade folders found students rated above average in mission preparation and crew coordination - Students rated below average in decision making and comm

Predmore (1991)

United Flight 232 (comm. analysis)

- No specification of CRM skills taught at United

Post-hoc Case Study

- Efficient use of resources - Distribution of communication across multiple tasks and members - Maximum utilization of 4th crew member - Explicit prioritizing of tasks - Active involvement of Captain through entire process

Diehl (1991)

Summary paper

- No specification of specific skills

Post-hoc

Kayten (1993)

NTSB reports

- No specification of skills taught within particular incidents

Post-hoc

Alkov (1991)

45 Naval aviation training squadrons (helicopters, attack bombers, multiplaced fighters)

- Pilot judgment - Situation awareness - Decision making - Policy and regulations - Command authority - Workload performance - Use of available resources - Communication skills - Crew dynamics (leadership/followership) - Communication - Decision making - Stress management

Quasi-experimental

- Decision making - Communication - Resolving conflict - Assertiveness - Feedback/criticism - Pilot attitudes

Quasi-experimental

-

- Reduction in accident rates cited from 4 environments (USAF, US Navy helicopters, US Navy fighter-bomber, Petroleum Helicopter Inc) - Several near incidents saved by good CRM practices – cites incidents

- Positive change in attitudes regarding CRM behaviors

- Contributed to better communication between instructors and students

- Overall aircrew mishap rate declined in all three communities

- Positive attitude change as measured by CMAQ - Stability of change over 5 year period

- Flight attendants reported that cockpit crew treated them with more respect after CRM - Flight attendants reported that cockpit crew made them feel more a part of crew and were included in more pre-flight briefings -Effective in helping them to reduce potential for mishaps; consciously apply techniques taught - 135 felt had become safer pilots - Some were able to comment on specific instances where had used

- Quarterly air carrier discrepancy reports significantly decreased

Results Studies

Multi-Level Studies

Byrnes & Black (1993)

Geis (1987)

Delta airlines

838 US Army pilots – 163 completed both pre and post CMAQ (results based on these)

- Squadrons report benefit and wish to continue

Self-report survey

Quasi-experimental Self-report survey

Self-report survey

- 140 felt material applicable to their job

- Paired t-tests indicated significant positive attitude change - At an item level most of the individual items of the CMAQ witnessed positive

Grau & Valot (1997)

Halliday, Biegalski, Inzana (1987)

Follow up: 290 questionnaires mailed to random subjects who had participated from the original 838; 3 months later data analyzed based on 142 responses Questionnaire sent to 312 crew members that had participated in CRM training (response of 172); French Air Force

349th Military Airlift Wing

- Crew topics - Communication - Understanding the situation - Confidence/Doubt - Occupational stress - Fatigue - Human error

Quasi-experimental

- Problem solving - Decision-making

Quasi-experimental

Pilots and spouses/ partners

- No specification of specific skills taught

- 95% of trainees see themes as relevant

Self-report survey

Self-report survey, Peer survey

Approximately 250 crew members trained

Hayward & Alston (1991)

change - Training had positive effect regardless of position in cockpit, level of experience, aircraft flown

- Situation awareness - Judgment - Problem solving - Workload mgmt - Stress management - Identification of resources - More….

Quasi-experimental

- 90% response rate to survey; indications are that students developed a highly receptive attitude toward seminar format

- Argue that attitudes toward CRM are also improving

- Most participants responded enthusiastically

- Report that workshops promoted increased awareness of human factors, crew performance, and safety implications - Report increased awareness of potential stressors and ways they could be dealt with - Without reinforcement, flightdeck management attitudes began to regress to pre-CRM levels - Cite normative cultural differences in attitudes

Self-report survey

Helmreich (1991)

Summary article Based on data collected through NASA/University of Texas Crew Research Project

- No specification of skills taught

Quasi-experimental Self-report survey

- 8,000 surveys from 3 airlines indicate that LOFT is valued by crews as a training technique, considerable variability in quality of scenarios

principles learned (40 comments in all)

- 50% said they frequently changed their cockpit behavior with other crew members - 65% report that the way they analyze situations or their own behavior has changed - 65% reported a difference in the ways that crews operate after CRM training - 80% self reported changes during flight, 53% mentioned changes during preparation phase, 35% reported changes during debriefing phase - 10% report a large impact in terms of changing life at squadron level (51% felt some occasional change) - Of those not trained in CRM who had flown with a person trained in CRM, 75% indicated that CRM trained crew members exhibited recognizable behavioral changes - 80% of untrained individuals felt that they had observed better coordination and flightdeck atmosphere from crew members who had undergone training

2 major airlines; independently developed CRM courses

- No specification of exact content

Ikomi, Boehm-Davis, Holt, & Incalcaterra (1999);

50 crews eastern US regional airline; 2 fleets (experimental/control fleet)

- Communication - Situation assessment - Planning/Decision Making

Quasi-experimental

Incalcaterra & Holt (1999)

600 active pilots in above airline (184 trained, 84 non ACRM group)

- Communication - Situation assessment - Planning/Decision Making

Quasi-experimental

Holt, Boehm-Davis, & Hansberger (1999)

All pilots in ACRM fleet and control fleet

- Communication - Situation assessment - Planning/Decision Making

Quasi-experimental

Jackson (1983)

4059 participants at United Airlines

- Inquiry - Advocacy - Conflict resolution - Critique - Decision making

Quasi-experimental

- Communication - 3 conditions (control,

Experimental

Helmreich & Wilhelm (1991)

Lassiter, Vaughn, Smaltz,

Undergraduates from University of Central

Quasi-experimental Self-report survey

developed - Overall pattern of evaluation was extremely positive - Lots of variability across seminars

toward CRM concepts - Significant positive change on all 3 scales of CMAQ

- Crews from ACRM group showed superior performance in 13 of 20 items (5 pt Likert scale – jump seat observations)

Observation

- 93% of pilots voted to expand use of ACRM to other fleets

Self-report survey

- Pilots trained in ACRM showed positive attitudes toward CRM (CMAQ) - Positive attitudes toward ACRM (neutral on wrkld) - Correct knowledge of content and timing of ACRM procedures (2 yrs after training)

Observation

Self-report survey

- Trained fleet performed significantly better overall (observer summary evaluation – jump seat observations)

- 91% felt it had improved their flight performance

- Performance for 9 of 10 behaviors significantly higher for trained group (LOE)

- Descriptive stats only - Participants tend to find experience quite rewarding (avg. response 6.8 on a 9 pt scale) - Over 95% of trained crew members felt CRM training has value to crew and can be applied to operation of their aircraft - LOFT program seems quite successful overall; most participants rate LOFT experience as very good

- 41% shift in self-perceived style of interaction – crew members more aware of own interaction style

- Argues that improvement has been noted in nearly all of the targeted areas during each of four quarters of the first year – apparently self-reported improvement. - Steady decline in the number of proficiency check failures since advent of LOFT at United

- Attitude change via CMAQ non-significant

- Teams receiving skill based comm. training exhibited significantly

- Trained fleet superior on 6 of 12 line check items

Morgan, & Salas (1990)

Florida (n=90)

Leedom & Simon (1995)

32 US Army UH-60 aviators (battle-rostered crews)

knowledge based training, skill based training)

- No specification of exact skills

better communication skills than other two conditions (behavioral observation rating scale)

Self-report survey, Observational ratings Quasi-experimental

- Small positive shift of attitudes (AACQ)

- Improved team communication patterns, more efficient mgmt. of crew resources, fewer team errors; team coordination improved (ACE checklist) - Mission performance ratings improved as did mission accomplishment (# completed as well as quality)

- No significant change (already had positive attitude toward CRM)

- Improvement across all 13 dimensions of team coordination - Higher flight proficiency ratings - Higher mission performance

Self-report survey, Observations

30 US Army AH-64 aviators (battle-rostered crews)

Margerison, Davies, & McCann (1987)

Mudge (1983)

Naef (1995)

Salas, Fowlkes, Stout, Milanovich, &

Australian airline

- No specification of exact skills

- Decision making - Planning/priority setting - Delegation - Communication - Interpersonal Skills

25 pilots enrolled – some with FAA, corporate pilots, individual pilots, airline executives testing program

- No specification of skills taught

Swissair line pilots and instructors

- Communication - Feedback - Decision making - Judgment - Functioning under pressure

35 pilots & 34 enlisted aircrewmen from Navy transport

Quasi-experimental Self-report survey, Observations
Quasi-experimental Self-report survey
Quasi-experimental Self-report survey

- Assertiveness - Communication - Situational awareness

- Favorable reactions

Quasi-experimental Self-report survey

Quasi-experimental

- Provides evidence of some learning (aircrew members better understand interpersonal issues; discussions more productive)
- Too early to say whether the program produces positive behavioral changes, but indications are positive
- The four pilots who have graduated so far self-report that their actions on the flight deck are changing as a result of the program
- Unsolicited comments from several pilots still in the program also indicate that their own actions are changing
- Survey taken 6 months later (53% response rate) indicated that 97% of flight crew members reported one or more instances of positive behavior transfer

- No final conclusions as program is ongoing – present trends
- No pilot has rated a study unit at less than good – more flight time = higher ratings

- Majority of pilots felt course presentation was above average (approx. 560 out of 640)
- Majority of instructor pilots felt the course related to their needs and was worthwhile
- Survey taken 6 months later (53% response rate) indicated that 70% of instructors would invest an off-duty day for this type of training
- Strong endorsement of training usefulness

- Trained group showed more positive attitudes towards use of CRM

- Overall trained teams performed better than untrained teams (TARGETS)

helicopter squadron

- Mission analysis

Self-report survey, Multiple choice, Observation

27 aviators from naval helicopter community (15 experimental, 12 control)

- Decision Making - Assertiveness - Mission analysis - Communication - Coordination - Leadership - Adaptability - Situational awareness

Quasi-experimental

Source

Community

CRM training content

Smith (1994)

10 undergraduate flight students

- Self analysis vs. traditional debrief to develop CRM skills

Prince (1999)

- Significant increases in positive attitudes via overall attitude scale (ACAQ) and CMAQ Communication and Coordination subscale - Trained group exhibited higher levels of knowledge regarding CRM principles

- Correctly managed 15% more during prebrief and 9% more during higher workload segment

- Strong endorsement of training - Positive support for CRM training - Reported that skills learned would be implemented for more effective pre and debriefs

- No significant change (restriction of range) (ACAQ) - Trained scored higher on knowledge test than untrained

- Trained teams performed better during preflight brief (TARGETS) - No differences in low workload times (TARGETS) - Trained teams engaged in greater number of teamwork behaviors during high workload segment (TARGETS)

Type of Study/ Data collection

Reactions

Learning

Quasi-experimental

- Self analysis was not valued highly as a training technique - LOFT reported as being more helpful than self analysis technique - Positive reaction to active learning technique - Positive reaction to training

- Self analysis produced little change in attitudes

- Self analysis helped crews to perform significantly better in 3 LOFT sessions, moderately better in 3

- Examples provided of how new skills are used on the job
- Trained group scored higher on knowledge test
- Positive attitude change found (CMAQ)
- Evidence of learning via changed knowledge structures (n=22)
- Attitude change positive, but not significant (n=12)
- Knowledge test did not show learning effects (suggests restriction of range)

- Trained teams performed better (TARGETS) as per both real-time instructor ratings and post-hoc ratings of video

Self-report survey, Multiple choice, Observation

Self-report survey, Observation

Stout, Salas, & Fowlkes (1996); Stout, Salas, & Fowlkes (1997)

42 student pilots in Navy Advanced Maritime Curriculum (20 experimental, 22 control)

- Communication - Assertiveness - Situation Awareness

Stout, Salas, & Kraiger (1997); Fowlkes, Lane, Salas, Franz, & Oser (1994); Fowlkes, Lane, Salas, Oser, & Prince (1992)

22 aviators, helicopter community

- Communication - Assertiveness

Wilhelm (1991)

8300 crew members from 4 airlines

- Summary article; results obtained from NASA/UT/LOFT survey
- No specification of skills

Subset of this also presented in

Experimental Self-report survey, Multiple choice

Experimental

- Positive reaction to training (n=12)

Self-report survey, Observation

Quasi-experimental Self-report

- Crew members value LOFT as a training technique

Findings Behavior

- Trained participants performed an average of 8% more desired behaviors than control (TARGETS, n=12)

- Crews typically think they do a much better than average job during LOFT (self evaluation) - Found low to moderate agreement

Results/ Organizational Impact

Butler (1991; 1993)

across airlines

survey, Instructor evaluation (1 organization)

in performance ratings between self evaluation and instructor evaluation (one organization)
- When asked about the exhibition of specific CRM behaviors in LOFT, large differences were found across organizations in CRM behavior patterns, although self-reports all tended to be high
- Significant but small relationship between instructor ratings and self-ratings of specific CRM behaviors
- Over a 2-year period at one airline, CRM behaviors decreased, as did ratings of scenario and instructor quality in LOFT
- Over a 3-year period at one airline, scenario, instructor, and LOFT delivery remained constant in perceived quality and self-reports of CRM behavior steadily increased; overall this airline received lower ratings than others on several LOFT scales

Biographies

Eduardo Salas
Affiliation: Institute for Simulation and Training, University of Central Florida, Orlando, FL 32826, and University of Central Florida Psychology Department, Orlando, FL 32816
Degree/Institution: Ph.D., Industrial/Organizational Psychology, 1984, Old Dominion University, Norfolk, VA

C. Shawn Burke
Affiliation: Institute for Simulation and Training, University of Central Florida, Orlando, FL 32826
Degree/Institution: Ph.D., Industrial/Organizational Psychology, 2000, George Mason University, Fairfax, VA

Clint A. Bowers
Affiliation: Team Performance Laboratory, University of Central Florida, Orlando, FL 32826, and University of Central Florida Psychology Department, Orlando, FL 32816
Degree/Institution: Ph.D., Clinical and Community Psychology, 1987, University of South Florida, Tampa, FL

Katherine A. Wilson
Affiliation: Institute for Simulation and Training, University of Central Florida, Orlando, FL 32826
Degree/Institution: Graduate Student, Human Factors and Applied Experimental Doctoral Program, University of Central Florida, Orlando, FL