
Psychology, Public Policy, and Law, 1996, Vol. 2, No. 3/4, 447-472

Copyright 1996 by the American Psychological Association, Inc. 1076-8971/96/$3.00

INTELLIGENCE AND JOB PERFORMANCE: Economic and Social Implications

John E. Hunter, Michigan State University
Frank L. Schmidt, University of Iowa

Author note: John E. Hunter, Department of Psychology, Michigan State University; Frank L. Schmidt, College of Business, University of Iowa. Correspondence concerning this article should be addressed to Frank L. Schmidt, College of Business, University of Iowa, Iowa City, Iowa 52242. Electronic mail may be sent via Internet to [email protected].

General mental ability (intelligence) is the dominant determinant of the large individual differences in work output on the job revealed by research, but highly visible individual differences in citizenship behavior on the job make the intelligence-performance relationship harder to observe in everyday life. Over time, the validity of job experience for predicting performance declines, while that of ability remains constant or increases. Path analyses indicate that the major reason ability predicts performance so well is that higher ability individuals learn relevant job knowledge more quickly and learn more of it. The current social policy that strongly discourages use of mental ability in hiring is counterproductive and has produced severe performance decrements. This policy should be changed to encourage the use of ability measures. However, it should also encourage the use of personality measures that increase overall predictive validity while simultaneously reducing differences in minority-majority hiring rates.

General mental ability (intelligence) plays a dominant role in the determination of individual differences in job performance. This statement is no longer controversial among researchers who study job performance. However, both laypeople and psychologists from other areas often find this statement controversial. Many people believe that this statement cannot be true even though it has been confirmed by thousands of empirical studies. There are two reasons why many people find it hard to believe that intelligence is the main determinant of variation in job performance. First, most people greatly underestimate the extent of differences in performance. Thus, they underestimate the importance of job performance in the evaluation of workers. Second, most people confuse two different dimensions of personnel evaluation: (a) performance (i.e., productivity) and (b) citizenship behavior (i.e., social behavior at work). These background problems are so strong that they interfere with the ability of readers to follow discussions of findings on performance. Therefore, we will first discuss citizenship behavior and the word performance. We will then discuss the extent of individual differences in performance. After discussing these two preliminary issues we will return to our main topic: the strong relationship between intelligence and job performance.

Citizenship Behavior and Performance

When supervisors evaluate workers, they rely heavily on factors other than the worker's productivity. In particular, they give substantial weight to the social behavior of the worker (Orr, Sackett, & Mercer, 1989). However, this is usually not
a conscious and articulated process; rather, it takes place implicitly during the evaluation of the worker that builds up over a period of time. Consider productivity in scientific research. If citizenship behavior and performance are objectively measured, they show little correlation. On the other hand, if subjective evaluation is done by supervisors, then ratings of these two dimensions are very highly correlated. Thus, supervisors do not keep these dimensions clearly separated. We believe that this is also true in the implicit theories of performance developed by laypeople and by scientists who do not professionally study work. This section is devoted to clearly distinguishing between citizenship and performance; we will briefly review what we know about predicting each.

Discussions of correlation tend to focus by convention on high values. Thus, the question is usually stated as Why would high intelligence tend to produce high worker evaluation? However, in this case it may be clearer if we focus on the low values. What are the typical performance problems? What are the typical complaints for citizenship problems? When these are listed, it becomes clear that they are distinct from each other. Typical performance problems are these: The worker is slow, prone to errors, has limited ability to follow instructions, and is reluctant to adopt improved work methods. Typical citizenship problems are these: The worker is disorganized, unreliable, argumentative, and untrustworthy.

Consider the reports from researchers who interview workers about performance variation. Workers are most aware of their own variability over time. They remember that when they first came on the job, they were slow to get things done, they made many errors, and there were many things that they couldn't do. All of these problems diminish with experience. Supervisors will cite these same three dimensions of low performance: slowness, error proneness, and limitations in capacity. However, supervisors add another dimension that workers have trouble seeing: rigidity in adopting new work priorities. At present, product life cycles are much shorter than worker life cycles. Although a worker might prefer to work at the same job for 40 years, most manufactured products now undergo fundamental redesign in 5 to 10 years (6 to 12 months in high-tech industries). This means that workers must abandon old work strategies and learn new strategies that are often incompatible with older strategies. Many workers have difficulty doing this.

Citizenship problems are not usually clear to those who have them; indeed, there is often vehement denial and righteous indignation when a worker is confronted with allegations of poor citizenship behavior. However, to observers it may be clear that a worker is disorganized or unreliable (tardy or erratic) or argumentative or untrustworthy. Thus, people are strongly aware of the deficiencies of others but often overlook their own.

Determinants of Citizenship Behavior

In the early years of the 20th century, industrial psychologists devoted little study to what teachers call "disruptive behavior." It was regarded as "obvious," and most scientists assumed that social behavior was well understood by both employers and workers and thus needed little scientific study. However, since 1985 there has been considerable study of disruptive behavior on the job (e.g., see Borman, White, & Dorsey, 1995; Motowidlo & Van Scotter, 1994; Ones,
Viswesvaran, & Schmidt, 1993; Organ, 1990; Organ & Konovsky, 1989; Organ & Ryan, 1995). Disruptive behavior has long been studied by developmental psychologists, and individual differences in disruptive behavior are a key focus in personality theory. The findings show that different disruptive behaviors are predicted by different personality traits. Disorganized, erratic, and unreliable behavior is characteristic of those who are low on the personality trait of conscientiousness. Personality tests of conscientiousness predict disorganized and erratic behavior on the job (Barrick & Mount, 1991; Mount & Barrick, 1995; Sackett & Wanek, 1996; Schmidt, Ones, & Hunter, 1992). People who are argumentative (uncooperative) and who have trouble getting along with others are those who are low on the personality trait of agreeableness. Personality tests of agreeableness predict a worker's social relationships with coworkers and supervisors. People who are untrustworthy (especially people who often lie and steal) are usually low on various personality traits. For example, meta-analyses of studies of juvenile delinquents (Hough, Eaton, Dunnette, Kamp, & McCloy, 1990) have shown that they are not only low on conscientiousness but are also high on negative affect (low impulse control, tenseness, etc.), low on agreeableness, somewhat introverted, and rigid (low in openness).

A good personality test battery that gives appropriate weight to the relevant personality dimensions will do a good job of predicting supervisor evaluations of workers (Mount & Barrick, 1995; Ones, 1993). That test should give high weight to conscientiousness and considerably lower weight to emotional stability and agreeableness. The weights to be given to extraversion and openness are still subject to debate.

Many people who do not study work scientifically fail to distinguish between poor citizenship behavior and poor performance. These are not highly correlated. There are people who are poor citizens but who can perform the work well. These poor citizens, who perform well when they do perform, are the workers that laypeople think of when they say "Intelligence just doesn't matter that much at work." However, there are many workers who are excellent citizens but who cannot do the job very well. These poor performers are usually tolerated by supervisors because they obviously try hard. These individuals also tend to be ignored in lay theories.

The unscientific focus on very bad workers is usually a focus on extreme cases: workers who are fired for individual reasons. In most cases, firing results from what the organization sees as moral reasons: workers who exhibit either serious or repeated poor citizenship. Most organizations respond to poor performance not by firing but by assigning the worker to simple jobs with low task variety (such as janitorial work), jobs in which poor performance does not have a major impact on customers. As a result, laypeople are much more aware of job problems that result from poor citizenship behavior than from poor performance.

Individual Differences in Productivity

Overview

Does it matter whether personnel selection works well or poorly? In our experience, most civil rights lawyers and many other "laymen" assume that the
method of selecting workers is of little importance. They feel no guilt in focusing solely on minority hiring and rejection rates as the criterion by which selection methods should be judged. Implicit in these arguments is the hidden belief that there is little difference between workers in terms of output. In particular, there is the implicit belief that highly qualified workers will differ little from workers with low qualifications (i.e., workers with high ability will differ little in output from workers with low ability). This belief is false to fact. Research has shown that there are very large individual differences in productivity, differences at the top as well as differences at the bottom. Good selection can make a vast difference in the average productivity of the workers selected. The differences between good and poor selection are large enough to have major impact on the functioning and effectiveness of the organization (Hunter, 1981a).

Scientific Findings for Productivity Differences

There are very large individual differences in job performance and productivity in all lines of work, as shown by the literature reviews in Schmidt and Hunter (1983) and Hunter, Schmidt, and Judiesch (1990). These reviews have shown that the overall work productivity of an organization is critically dependent on the extent to which the organization selects the most productive workers possible. (See also Schmidt, Hunter, McKenzie, & Muldrow, 1979; Schmidt, Hunter, Outerbridge, & Trattner, 1986.)

Productivity can be measured in many ways. To cast productivity in units that can be compared across jobs, a scientific study must use what is called a ratio scale of measurement, a scale for which it is meaningful to compute ratios to compare performance. Hunter et al. (1990) reviewed the studies of productivity differences that were able to use a ratio scale for performance.

Consider a given job. If performance is measured using a ratio scale, then it is possible to compare the performance of each worker to the performance of the average worker. For each worker, we can compute the ratio of that worker's productivity to the level of productivity for the average worker. The standard deviation of those ratios measures the extent to which workers on that job differ from one another in productivity. The key advantage of the productivity ratio is that its standard deviation has the same meaning from one job to another. Thus, results can be compiled across studies to create a baseline for scientific generalization. This is what was done by Hunter et al. (1990); their key results are presented in Table 1.

Table 1
Applicant Output Standard Deviations for Occupational Groups

Occupation                    No. of studies    % SD*
Low complexity
  Routine blue collar               23            18
  Routine clerical                  18            20
  Average                           41            19
Medium complexity
  Crafts                             7            29
  Decision-making clerical           5            35
  Average                           12            32
High complexity
  Professional judgment              7            48
Sales
  Insurance                         15           120
  Noninsurance                       7            48

Note. Figures presented are averages across studies. From "Individual Differences in Output Variability as a Function of Job Complexity," by J. E. Hunter, F. L. Schmidt, and M. K. Judiesch, 1990, Journal of Applied Psychology, 75, p. 35. Copyright 1990 by the American Psychological Association. Adapted with permission of the authors.
* Standard deviation of output as a percentage of average output on the job.

Examination of Table 1 shows that there is one special set of jobs that differs from others: sales jobs. Productivity differences in sales are much larger than differences in other jobs at the same level of complexity. If we ignore sales jobs, then differences in productivity are primarily determined by the level of mental complexity of the job. The higher the level of complexity, the larger are individual differences in productivity.

Extreme Worker Comparisons

Table 2 presents the same facts contained in Table 1 in a way that is intended to be more easily interpreted by readers. The question considered in Table 2 is this: How big is the difference in work
output between top and bottom workers? The problem with this question is that for real distributions, there is no top worker and no bottom worker. The distribution of performance is the "normal" distribution, which has no exact top and no exact bottom. No matter how good your best worker has been, you only have to wait a long enough period of time and you will find a worker who is better yet. Alas, the same is true at the bottom. No matter how bad your worst worker has been, it is only a matter of time before someone still worse shows up. The solution to this problem is to create a quantitative definition for the words top and bottom. This definition is necessarily arbitrary, but other definitions can be shown to generate very similar results. The definition used in Table 2 is to define the top workers as those in the top 1% in performance. The bottom workers are defined as those in the bottom 1% in performance.

Table 2
Performance Differences Between Extreme Groups of Applicants

Group                     Performance difference
Top 1% vs. bottom 1%
  Low complexity          153:47 = 3.26
  Medium complexity       188:12 = 15.67
  High complexity         Not normal*
  Insurance sales         Not normal*
  Other sales             Not normal*
Top 1% vs. average
  Low complexity          153:100 = 1.53
  Medium complexity       188:100 = 1.88
  High complexity         Not normal*
  Insurance sales         Not normal*
  Other sales             Not normal*

* Ratios cannot be estimated because output is not normally distributed.

The top part of Table 2 compares top and bottom workers. That comparison shows very large differences between workers. Consider the smallest comparison: that for low-complexity work (the bottom 20% of the workforce). Whereas top workers produce 53% more than average, bottom workers produce 53% less than average. If we compare the top workers to the bottom workers, the ratio of productivity is 153:47, or more than 3 to 1. For medium-complexity work (the middle 63% of the workforce), the difference is larger. Top workers are 88% more productive than average, whereas bottom workers are 88% less productive. If we compare the top workers with the bottom workers, the ratio of productivity is 188:12, or more than 15 to 1.

Most people do not think quantitatively; they think in terms of binary distinctions such as competent or incompetent. This language makes it easy to consider very poor performance for bottom workers but makes it hard to consider very good performance for top workers. Most lawyers we have dealt with in selection cases are not surprised to find that bottom workers are much worse than average. However, they are surprised to find that top workers are so much more productive than average workers. The bottom part of Table 2 is designed to bring this point home. For low-complexity work, top workers are 53% more productive than average workers. For medium-complexity work, the difference is larger: Top workers are 88% more productive than average workers.
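The 153 and 188 figures in Table 2 can be approximately reproduced by taking the conditional mean of the extreme 1% tails of a normal output distribution with mean 100. The Python sketch below does this; it is our illustration, not the authors' published computation, and the SD values of 20 and 33 are assumptions chosen to land near the published figures (Table 1's rounded averages are 19 and 32):

```python
import math

Z99 = 2.3263  # z-score cutting off the top 1% of a standard normal

def normal_pdf(z: float) -> float:
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def extreme_group_means(sd_pct: float):
    """Mean output of the top 1% and bottom 1% of workers, as a percentage
    of average output, assuming output is normal with mean 100."""
    # For a standard normal, E[Z | Z > Z99] = pdf(Z99) / P(Z > Z99)
    shift = sd_pct * normal_pdf(Z99) / 0.01
    return 100 + shift, 100 - shift

for label, sd in (("low complexity", 20), ("medium complexity", 33)):
    top, bottom = extreme_group_means(sd)
    print(f"{label}: top 1% = {top:.0f}, bottom 1% = {bottom:.0f}, "
          f"ratio = {top / bottom:.2f}:1")
```

Under these assumptions the low-complexity ratio comes out near 3.3:1 and the medium-complexity ratio near 15.6:1, matching the "more than 3 to 1" and "more than 15 to 1" statements in the text.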

Qualitative Aspects of Productivity

Some kinds of work have qualitative as well as quantitative aspects. This is very much true of police work and firefighting, because of the public safety implications of this work. Consider crime investigation. A typical problem is serial rape. An officer responds to a report of rape. The officer interviews the victim and files a report. These reports are used by detectives to track down the rapist. A purely quantitative analysis of police work would focus on the efficiency of the interview and the report-writing process. A high-performing officer (one in the top 1% in performance) will perform both tasks 88% more efficiently than a randomly selected officer. So if the police department uses randomly selected officers, the department will need 88% more officers to handle the same number of reports.

However, there is a qualitative difference in the work of good and poor interviewers. Suppose that for an average police officer, the probability that the interview and report generate a useful clue is 10%. For an optimally selected police officer, the probability of a useful clue will be, on average, 88% higher (i.e., 19% rather than 10%). This difference in quality translates into a large difference in the number of reports needed to catch the rapist:

Probability of successful clue    Number of reports needed to catch rapist
            10%                                    10
            19%                                     5

That is, with high-quality work, a rapist will be caught after an average of 5
victims. With average-quality work, the rapist will not be caught until 10 women have been victimized and interviewed. The police department could compensate for the quantitative implications of poorer work by hiring twice as many officers to handle complaints. However, the community would still experience twice as many rapes, and that impact would not be counted in the quantitative assessment.
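The 10-versus-5 figures follow from a simple waiting-time argument: if each victim interview independently yields a useful clue with probability p, the expected number of reports before the first useful clue is 1/p. A minimal sketch (the independence assumption is ours, added for illustration):

```python
def expected_reports(p_clue: float) -> float:
    """Mean number of reports filed until the first useful clue appears.

    The wait for the first success of repeated Bernoulli trials is
    geometric, so the expectation is simply 1 / p.
    """
    return 1.0 / p_clue

for p in (0.10, 0.19):
    print(f"p = {p:.2f}: about {expected_reports(p):.1f} reports needed")
# p = 0.10: about 10.0 reports needed
# p = 0.19: about 5.3 reports needed (the article rounds to 5)
```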

Basic Validity Findings

Comprehensive Reviews

This section presents a brief overview of the scientific findings on the importance of ability in the determination of individual differences in job performance. The findings of thousands of studies show that individual differences in ability are important determinants of the level of job performance. With the exceptions noted below, other predictors are either much weaker or are relevant to the prediction of citizenship behavior rather than performance.

There have been many comprehensive reviews of validation studies. Early reviews, including Ghiselli (1966, 1973), Dunnette (1972), Reilly and Chao (1982), and Vineberg and Joyner (1982), were themselves reviewed and quantified by Hunter and Hunter (1984). Hunter and Hunter also added a number of new meta-analyses. There was a subsequent review of published research by Schmitt, Gooding, Noe, and Kirsch (1984), though they made no corrections for unreliability or restriction in range. Validity estimates are biased unless proper corrections for error of measurement and range restriction are made (Hartigan & Wigdor, 1989). Hunter and Hunter made these corrections, whereas Schmitt et al. did not. The proper corrections to the Schmitt et al. findings were made by Hunter and Hirsh (1987). Once those corrections were made, there were essentially no differences between the findings of Schmitt et al. and the findings of Hunter and Hunter.

A number of reviews have been done since Hunter and Hunter (1984). Hunter and Hunter (1984) was updated by Hunter and Hirsh (1987) and still more recently by Schmidt and Hunter (in press). Other reviews have been somewhat more limited than these because they are devoted to only a portion of the job spectrum. Pearlman, Schmidt, and Hunter (1980) conducted a meta-analysis of the prediction of clerical performance. Schmidt, Hunter, and Caplan (1981a, 1981b) reviewed studies predicting performance in crafts jobs in the petroleum industry. Northrup (1986) reviewed the prediction of crafts and trades jobs. Trattner (1988) reviewed the prediction of a wide variety of professional jobs. There have also been meta-analyses for specific jobs, such as the review of predictors of job performance for police (Hirsh, Northrup, & Schmidt, 1986) and for firefighters (Hunter, 1990).

The Definition of Ability

The definition of ability differs from one scientific study to another. The ability most studied is general cognitive ability, which measures the ability to learn (Hunter, 1986). Psychologists often call this ability intelligence, although psychologists use this word somewhat differently from nonpsychologists. Psychologists do not mean genetic potential, but rather the level of ability developed at the time that
an ability test is taken. Because studies show that there is usually little change in ability during the adult years, industrial psychologists use the word intelligence for the level of ability developed when a person reaches his or her adult years.

General cognitive ability is not the only ability relevant to work and learning on the job. There are other important abilities that are relevant to many jobs: psychomotor ability, social ability, and physical ability. However, the validity for these abilities is much more variable across jobs and has a much lower average validity across jobs. Review of these other abilities is beyond the scope of this article.

Aptitudes are abilities that are narrower than general mental ability. Verbal ability, mechanical ability, and numerical ability are examples of aptitudes. Until recently, it was widely believed that job performance could be much more successfully predicted using a variety of aptitudes than using general mental ability alone. Multiple aptitude theory hypothesized that different jobs required different aptitude profiles and that regression equations containing different aptitudes for different jobs would therefore optimize the prediction of job performance. Despite its compelling plausibility, this theory has now been disconfirmed. Differentially weighting multiple aptitudes produces little or no increase in validity over the use of measures of general mental ability. It has been found that tests of aptitudes measure in substantial part general ability; in addition, each measures something specific to that aptitude (e.g., specifically verbal aptitude, over and above general ability). The general mental ability component appears to be responsible for the prediction of job performance, whereas the factor specific to the aptitude appears to contribute little or nothing to prediction. The research showing this is presented and reviewed in Hunter (1986), Hunter (1980b), Jensen (1986), Olea and Ree (1994), Ree and Earles (1992), Ree, Earles, and Teachout (1994), Ree and Carretta (in press), and Schmidt et al. (1992), among other sources.

The results of validation studies differ according to the measure of performance. Most studies have used performance ratings, many studies have used training success, a few studies have used promotion or job level, and a trickle of studies have used work sample measures of performance. Although Hunter and Hunter (1984) and subsequent reviews have accumulated considerable evidence as to promotion and work sample performance measures, we focus here on validities against performance ratings and training success measures.

On the basis of meta-analysis of over 400 studies, Hunter and Hunter (1984) estimated the validity of general ability to be .57 for high-complexity jobs (about 17% of all U.S. jobs), .51 for medium-complexity jobs (63% of jobs), and .38 for low-complexity jobs (20% of jobs). This is the largest database available using a measure of performance on the job. For performance in job training programs, a number of large databases are available, and many of these are based on military training programs. Hunter (1986) reviewed military databases totaling over 82,000 trainees and reported an average validity of .63 for general ability. This figure is similar to those reported in various studies by Ree and his associates (Ree & Carretta, in press; Ree & Earles, 1991), by Thorndike (1986), by Jensen (1980), and others.
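The corrections for criterion unreliability and range restriction discussed above rest on standard psychometric formulas: disattenuation for unreliability in the criterion and the Thorndike Case II correction for direct range restriction. The sketch below illustrates the arithmetic; it is not code from any of the reviews, and the input values (other than the .67 SD ratio, which the article reports later for the USES studies) are assumptions chosen for the example:

```python
import math

def correct_for_range_restriction(r: float, u: float) -> float:
    """Thorndike Case II: u = restricted SD / unrestricted (applicant) SD."""
    return r / math.sqrt(u * u + r * r * (1.0 - u * u))

def correct_for_attenuation(r: float, ryy: float) -> float:
    """Disattenuate a validity for unreliability in the criterion measure."""
    return r / math.sqrt(ryy)

r_observed = 0.25  # assumed raw validity in an incumbent sample
u = 0.67           # incumbent/applicant SD ratio reported for USES studies
ryy = 0.60         # assumed interrater reliability of supervisor ratings

r = correct_for_range_restriction(r_observed, u)
r = correct_for_attenuation(r, ryy)
print(f"estimated operational validity: {r:.2f}")  # about .46 with these inputs
```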
The relative validity of different predictors of job performance was reviewed by Schmidt and Hunter (in press). Among predictors that can be used for entry-
level hiring, none are close to general mental ability in validity. This review found that the next most valid predictor was integrity tests (shown by Ones, 1993, to measure mostly the personality trait of conscientiousness); integrity tests can be used along with general ability to yield a combined validity of .65. Among predictors suitable for hiring of workers already trained or experienced, only job sample tests (tests composed of samples of actual work from the job) and job knowledge tests are comparable in validity to general ability. As we will see later, job knowledge and job sample performance are consequences of general ability (Schmidt & Hunter, 1992). For the reasons discussed in Schmidt and Hunter (in press), predictors other than ability are best used as supplements that increment the validity of ability alone. On the basis of all available meta-analyses, Schmidt and Hunter (in press) present incremental validity and utility estimates for 15 such predictors of job performance and 8 of training performance. Some of these predictors are positively correlated with ability (e.g., the employment interview) and so are, in part, measures of ability.

Experience, Ability, and Performance

Overview

Many labor unions have argued that promotion and retention should be based entirely on seniority. Implicit in this claim is the belief that performance is entirely determined by experience. In part, this theory is true. Learning plays a major role in the determination of job performance. Experience provides the medium for learning, and thus people with more experience have had an opportunity to learn more and hence achieve a higher level of job performance. Certainly, an employer should consider experience when hiring.

What is missed in the labor union argument is individual differences in learning. If one worker learns twice as fast as another, the same amount of experience will produce a much higher level of performance in the fast learner than in the slower learner. That is, to say that experience is a key predictor of performance is not to say that intelligence plays little role. It is intelligence that turns experience into higher job knowledge and hence higher performance. This section will present evidence showing that on most jobs, there is a very large mass of information to be acquired and mastered, even though much of that information would properly be called trivia from the perspective of an outsider. Because there is so much to be learned, individual differences in learning play a major role in job mastery.

Seniority advocates claim that the validity of mental ability is much diminished if workers are experienced. However, most existing validation studies are done with experienced workers. If experience eliminated the validity of ability, then most studies would find near zero validity. Consider the Hunter (1980c; Hunter & Hunter, 1984) analysis of 425 USES (U.S. Employment Service of the U.S. Department of Labor) studies predicting performance ratings. All of these studies were conducted assessing the performance of experienced workers. The mean level of plant experience was 61 months with a standard deviation of 48. Thus, about 14% of the workers had more than 9 years of experience, and 2% had more than 13 years. Yet, among workers with this high level of experience, the average predictive validity of the optimal
General Aptitude Test Battery (GATB) ability composite predicting performance ratings is .51. Consider also the Hunter (1986) study of predictive validity predicting objectively measured performance. Those studies were conducted in the same way as those in the USES database. Each study measured the performance of all workers then employed. Yet, the predictive validity of general cognitive ability predicting objectively measured performance was .75.

The only studies with low-experience workers are the training success studies. In training studies, most workers start with zero experience and are assessed within a year of starting. The fact is that the myriad studies predicting performance ratings have all been done with experienced workers, and each study contains some very highly experienced workers. These studies have found very high predictive validity for those highly experienced workers. Many have claimed that ability differences make little difference for experienced workers. That claim is false.

Ability Differences Over Time

Consider first the study by Schmidt, Hunter, Outerbridge, and Goff (1988). These researchers analyzed data for four military occupations in which workers had been assessed for job knowledge, objectively measured actual performance, and performance ratings. Their data allowed mean comparisons between high- and low-ability groups for each year of experience out to 5 years.

For job knowledge, Schmidt et al. (1988) found large differences between the two ability groups at all levels of experience. The size of the difference was the same after 5 years as after 1 year of experience. For objectively measured job performance, Schmidt et al. (1988) again found large differences between the two ability groups at all levels of experience. The size of the difference was the same after 5 years as after 1 year of experience. For performance ratings, Schmidt et al. (1988) found definite though smaller differences between the two ability groups at all levels of experience up to 5 years. Again, the size of the difference was the same after 5 years as after 1 year of experience.

There are much more data on supervisor ratings than those reviewed in Schmidt et al. (1988). The USES has gathered considerable data since the time of the Hunter (1980c) analysis. These data have been stored in individual form and can be reanalyzed in ways not envisioned by the original study-report writers. McDaniel (1985) analyzed these data to look at ability differences for groups differing in level of experience. The Schmidt et al. (1988) study showed that the differences in job knowledge, objectively measured performance, and supervisory performance ratings that are due to ability remain constant for levels of experience as far out as they had data to examine (i.e., for up to 5 years of experience). What is needed are data on performance ratings that extend past 5 years and are not contaminated by sample attrition problems. The USES database has these properties. McDaniel (1985) conducted the analysis needed to address this question. He correlated ability with performance ratings for each level of experience out to 12 years and up. The results are summarized in Table 3. As the level of experience goes up, the predictive validity does not decrease.

Table 3
Correlation Between Ability and Performance Ratings for Job Incumbents of Various Levels of Job Experience

Years of experience    Total sample size    Ability-performance correlation
0-3                          4,424                     .35
3-6                          3,297                     .37
6-9                            570                     .44
9-12                            84                     .44
12+                             22                     .59

Note. From The Evaluation of a Causal Model of Job Performance: The Interrelationships of General Mental Ability, Job Experience, and Job Performance (p. 76), by M. A. McDaniel, 1985, unpublished doctoral dissertation, George Washington University. Adapted with permission of the author.

Validity goes from .35 for 0-6 years to .44 for 6-12 years to .59 for more than 12 years (although the last value is based on one tiny sample). If anything, the McDaniel (1985) data suggest an increase in the predictive validity of ability predicting performance ratings as the level of worker experience goes up.

Predictive Validity of Experience

Hunter and Hunter (1984) reanalyzed the results of 425 studies conducted by the USES to determine the validity of mental ability as measured by the GATB for predicting job performance ratings. Most of these studies (88%) also reported the predictive validity of experience. Hunter and Hunter found the mean predictive validity of experience to be .18 across the 373 studies.

Many advocates of training as a way of increasing performance claim that the predictive validity of ability vanishes once workers get experience. We have seen that this claim is false. What about the other question: How relevant are experience differences among highly experienced workers? McDaniel, Schmidt, and Hunter (1988) studied this question using the USES individual worker database. The results are presented in Table 4.

Table 4
Correlation Between Amount of Experience and Performance Ratings for Job Incumbents of Various Levels of Experience

Years of experience    Total sample size    Experience-performance correlation
0-3                          4,490                      .49
3-6                          5,088                      .32
6-9                          3,588                      .25
9-12                         1,274                      .19
12+                          1,618                      .15

Note. From "Job Experience Correlates of Job Performance," by M. A. McDaniel, F. L. Schmidt, and J. E. Hunter, 1988, Journal of Applied Psychology, 73, p. 329. Copyright 1988 by the American Psychological Association. Adapted with permission of the authors.

Training advocates claim that experience differences become more and more important as workers become more and more experienced. The data in Table 4 show this to be quite false. It is just the opposite. Differences in experience are very important among the newly hired: The correlation between experience and
performance ratings is .49 for those who had been there for 0-3 years. This correlation drops rapidly to a low of .15 for those who had been there 12 years and up. This is explained by other data presented in McDaniel (1985) showing the nonlinear relationship between experience and performance. The relation between experience and job performance shows the same shape as other learning curves: It is nonlinear and monotonic (Schmidt et al., 1988; Schmidt & Hunter, 1992).

Ability Versus Experience as Predictors

Many labor leaders assume that the predictive validity of experience is higher than the predictive validity of ability, especially in highly experienced workers. The studies reviewed above provide massive data on which to evaluate this claim. These data show this claim to be false. Hunter and Hunter (1984) examined each of the 425 USES studies predicting performance ratings for the predictive validity of both experience and ability. They found the average predictive validity for experience to be .18, whereas the predictive validity of ability is .51. Thus, the predictive validity of ability is nearly 3 times larger than the predictive validity of experience. Not only are seniority advocates wrong, but the difference in validity is very strongly in the opposite direction to their claims.

Experience and ability in combination. Labor leaders usually claim that if a rational employer were to consider both ability and experience in combination, the employer should give far higher weight to experience than to ability. But this conclusion is based on the false belief that experience is a better predictor of job performance than is ability. The fact is that ability predicts performance ratings 3 times better than does experience. Because experience and ability are uncorrelated, this means that a rational employer would give 3 times as much weight to ability as to experience. (A short sketch of this weighting argument follows Table 5.)

Comparison over time. By comparing the data from Tables 3 and 4, the relative predictive validity of experience and ability for workers with different levels of experience can be seen. This comparison is shown in Table 5.

Table 5
Comparison of the Correlation Between Ability and Performance Ratings With the Correlation Between Amount of Experience and Performance Ratings for Job Incumbents of Various Levels of Experience

Years of experience    Ability-performance correlation*    Experience-performance correlation**
0-3                                .35                                  .49
3-6                                .37                                  .32
6-9                                .44                                  .25
9-12                               .44                                  .19
12+                                .59                                  .15

* From The Evaluation of a Causal Model of Job Performance: The Interrelationships of General Mental Ability, Job Experience, and Job Performance (p. 76), by M. A. McDaniel, 1985, unpublished doctoral dissertation, George Washington University. Adapted with permission of the author.
** From "Job Experience Correlates of Job Performance," by M. A. McDaniel, F. L. Schmidt, and J. E. Hunter, 1988, Journal of Applied Psychology, 73, p. 329. Copyright 1988 by the American Psychological Association. Adapted with permission of the authors.

Table 6
Correlations Between General Cognitive Ability, Job Knowledge, Job Performance, and Supervisor Ratings

Civilian (N = 1,790)
        A    JK    WS    SR
A     100    80    75    47
JK     80   100    80    56
WS     75    80   100    52
SR     47    56    52   100

Military (N = 1,474)
        A    JK    WS    SR
A     100    63    53    24
JK     63   100    70    37
WS     53    70   100    31
SR     24    37    31   100

Note. Data presented here are from Hunter (1983a) and have been corrected for range restriction. A = ability: general cognitive ability, the factor that explains the high correlations between different cognitive aptitudes (primary factors) and between achievement in diverse areas, estimated by a composite across various aptitudes such as quantitative and verbal aptitude. JK = job knowledge: a content-valid (on the basis of job analysis) measure of job knowledge. WS = job performance: a content-valid (on the basis of job analysis) work-sample measure of job performance, that is, performance at work stations where performance can be objectively measured. SR = supervisor ratings: supervisor ratings of job performance; correlations are corrected for interrater reliability, so these ratings are free of both random response error and halo.

Table 5 shows that for long-term workers, the correlation between experience and performance goes down and the correlation between ability and performance goes up. For very long-term workers, the predictive validity of ability is nearly 4 times larger than the predictive validity of experience. The data in Table 5 disconfirm nearly every conclusion about intelligence stated by advocates of seniority and training. Intelligence matters not only during the early stages of job learning but throughout the worker's tenure. This is becoming more and more true in current organizations because of ever more rapidly changing product life cycles. Workers are learning new methods of production at ever shorter intervals. Furthermore, when work methods change, old learning can produce proactive interference and negative transfer of training; overcoming such interference is another advantage of high-intelligence workers over low-intelligence workers.

Learning: Job Knowledge, Ability, and Performance

This section presents a theory explaining why there is such a high predictive validity for general cognitive ability. Hunter (1986) presented data accumulated from 14 research studies confirming that theory. Evidence from other sources was presented by Schmidt and Hunter (1992). The theory is composed of two parts: (a) the classic learning theory of Thorndike and (b) the theory of specific cognitive skills distilled from the factor analytic studies.

Edward Thorndike's many studies of educational, vocational, and industrial training led him to conclude that the main determinant of individual differences in job performance is individual differences in learning. Because general cognitive ability predicts effectiveness of learning, it predicts differences in job performance. Thus, the classic theory of human performance in the 1920s predicts a high validity for general cognitive ability predicting job performance. The factor analytic studies of human performance identified about 30 specific cognitive skills that are each used in many different tasks (French, 1951, 1954).

Cognitive-skill job analyses virtually always find at least a half dozen of the 30 skills to be relevant to each job studied. A composite test spanning even a few of these skills has been shown to function as a measure of general cognitive ability. Thus, the universal pattern of relevance of the specific cognitive skills also implies high predictive validity for general cognitive ability predicting job performance. The fact that both theories are correct is shown in the path analysis of the data cumulated by Hunter (1986), a study that will be reviewed here.

Work sample performance measurement. Most psychologists believe that the best measure of job performance is a work- or job-sample measure. A work-sample measure of performance is obtained by setting up work stations where performance can be directly observed and measured. Performance scores are then added across work stations (possibly with more weight given to more important stations). Performance can sometimes be physically measured on dimensions of quantity and quality. Otherwise, performance is compared with preset benchmarks based on judgments of experts. Work-sample measures provide a direct, objective measure of job performance.

Most psychologists would prefer to do validation studies using work-sample measurement of performance. However, this is usually infeasible. For example, for firefighters and police officers, there would be an immense number of work stations; there are 440 different tasks in firefighting and over 165 tasks for police work. Second, tasks such as accurate work in the presence of physical danger or under stresses such as emergencies may be difficult or dangerous to stage. Finally, even when work-sample measurement is feasible, it is usually prohibitively expensive. Thus, whereas there are thousands of validation studies done with supervisor ratings of performance, there are only a small number of studies using work samples. However, these studies are of high quality and provide a unique theoretical perspective.

Learning and performance. The classic theory relating ability to performance derives its predictions from the learning process (Brolyer, Thorndike, & Woodyard, 1927). Learning may take place in a formal training environment, or it may take place on the job. The parameters of learning are different for the two environments. Learning in a formal training program means absorbing knowledge that is presented directly to the student with the important features of the knowledge already emphasized. Learning on the job requires two steps. First, if a relevant event takes place, the worker must recognize the event as significant. Second, the worker must be able to formulate the lesson inherent in the event in such a way as to learn from it. Cognitive ability is critical to the recognition process because the worker must link current information to the knowledge already in memory. Cognitive ability is necessary to learning from the recognition process because the information must be restructured to a form that allows future recognition. Thus, learning on the job will be even more dependent on cognitive ability than learning in a formal program.

According to the classic theory, performance is bounded by learning. If the worker has not learned what to do in a given situation, then the worker cannot respond correctly. Thus, there should be a high correlation between learning and performance.

Adaptation and application of learning. Learning is a necessary but not sufficient condition for performance. Performance may require that the worker go
beyond knowledge of the job. Consider a police officer who suspects a violation of law in regard to a commercial transaction. Violations differ drastically from one another, especially if they stem from ignorance of the law. Only some situations will "fit the book"; others will require adaptive skills such as interviewing, situation defusing, and exploratory information gathering. Consider a firefighter who enters a building where manufacturing takes place. The wise move depends on what chemicals are used in that plant, where they are located, how they are stored, and so on. In almost all situations, the worker must make decisions and adjustments to relate his or her knowledge to the situation at hand. This adjustment process is known as judgment, problem solving, adaptation, or application. A careful analysis of adaptation shows that many different specific cognitive skills are used in any given problem-solving context. That is, judgment or common sense is quite multidimensional. The factor analytic literature has identified many of those skills. The people who developed the classic learning theory of performance were well aware of the need for adaptive skills to link knowledge to performance. That is why the classic theory of performance predicts that cognitive ability will correlate with performance above and beyond the correlation determined by the correlation between cognitive ability and learning.

Performance ratings. According to the classic theory, supervisors are mainly observers of performance. However, a supervisor's perceptions of performance are colored by a variety of non-work-related factors. That is, a supervisor will be influenced by all of the factors known to influence person perception, factors such as personal appearance, similarity of background to that of the supervisor, and moral conventionality. Furthermore, the classic theory would predict that supervisor perceptions will be influenced by idiosyncratic factors such as the match or mismatch between the personality of the worker and the personality of the supervisor.

The classic theory predicts that supervisor performance ratings will be only an indirect measure of performance. Ratings of the same worker by different supervisors will disagree to the extent that perceptions are influenced by idiosyncratic factors. An average rating across a population of raters would eliminate the idiosyncratic component of ratings, but it would still leave nonwork factors that are common to all raters. Also, averaged ratings would still reflect shared rater perceptions of citizenship behavior. Thus, even if idiosyncratic factors are eliminated, the purified ratings will still not correlate perfectly with performance.

Summary of classic theory. In summary, the classic theory relating job performance to cognitive ability makes a number of correlational predictions. Because the rate and amount of learning is determined by cognitive ability, the classic theory predicts a high correlation between cognitive ability and learning. Because performance is learned, the classic theory predicts a high correlation between learning and performance. Because innovative adaptation is required by most actual work situations, the classic theory predicts that cognitive ability will be even more highly correlated with performance than would be predicted from the high correlation between ability and learning. Supervisor ratings are predicted to be imperfect measures of performance in two ways: (a) Supervisor perceptions will disagree because supervisors are influenced by idiosyncratic nonwork factors
such as the personality match or mismatch with the worker, and (b) supervisors will be influenced by nonwork factors that influence all supervisors.

Specific cognitive skills. While Thorndike was doing his classic work, other psychologists were turning from the study of general work processes to the study of specific tasks. Over a 20-year period, this led to the discovery of many specific cognitive skills and to the development of standardized tests to measure those skills. This was the early factor analytic study of human performance. As of 1951, French found 151 studies that he could cumulatively analyze for identifiable skills. He identified 30 skills and put together the French kit, which contained easy-to-administer paper-and-pencil measures of each skill (most of which were not originally defined by paper-and-pencil tests). French's work continued. The Ekstrom, French, Harman, and Dermen (1976) review located 209 studies; Carroll (1986, 1993) found over 300 studies he could reanalyze.

Once the cognitive skills became known to industrial psychologists, they were incorporated into the job analysis process (Lawshe, 1952, 1975, 1984). Lawshe showed that a task analysis is only the first step in psychological job analysis. A task analysis identifies work products and objectives. The psychological analysis then identifies work processes, including the skills, decisions, and other actions used in the work. The skills may be cognitive-perceptual (spotting the anomaly in a steel sheet), cognitive-memory (remembering a complicated set of instructions), or cognitive-reasoning (figuring out whether an awkward cart will fit around the corner on a flight of stairs). The decisions made may rely either on knowledge (how much does a sheet of steel weigh?) or on reasoning (putting the hot steel into the water will cause steam that will cause . . .). If knowledge is used, then performance will depend on learning ability. If reasoning or adaptation is required, then performance will depend on thinking skills.

Full psychological job analyses have all found that for each task, several of the 30 skills in the French kit are relevant. Thus, a full job with many tasks, such as the 440 tasks of firefighting, will require a half dozen or more of those skills. This means that overall job performance will depend on a battery of cognitive skills. A test that measures total performance across several skills will function as a test of general cognitive ability above and beyond its function as a test of the specific skills measured. Thus, the cognitive skills literature taken together with the findings of psychological job analysis shows that general cognitive ability tests also measure the impact of the specific cognitive skills used in immediate performance. The cognitive skills literature predicts that general cognitive ability will have even higher validity than that predicted by individual differences in learning.

Testing the classic theory. The predictions of the classic theory can be tested empirically. In order to do this, each factor must be made observable. Cognitive ability was made observable by the testing research of the first 40 years of this century (Tyler, 1965; Vernon, 1956). The learning process can be measured after the fact by measuring job knowledge. The greater the worker's job knowledge, the greater the learning that has taken place. Job performance can be measured using work-sample methods. For theoretical purposes, there are at least four key variables to be observed in validation: general cognitive ability, job knowledge, job performance, and performance ratings. For simplicity, abbreviated language for these will be used in the following discussion. The word ability will mean general cognitive ability. The
word knowledge will mean job knowledge. The word performance will mean work-sample performance. The word ratings will mean supervisor performance ratings. Each variable will be considered to be perfectly measured. Empirical data will be fully corrected for biases due to measurement error and range restriction to provide estimates of the corresponding correlations in an applicant population.

Once the theory has been mapped onto observed variables, the theory can be tested by checking the observed correlations against predictions. The theory predicts a high correlation between ability and knowledge. The theory predicts a high correlation between knowledge and performance. These two facts taken together imply a high correlation between ability and performance. However, because of adaptation and the contribution of day-to-day cognitive skills to performance, the theory predicts an even higher correlation between ability and performance than would be predicted by the link of ability to knowledge. This prediction can be tested using the standardized multiple regression of performance onto ability and knowledge together. Standardized multiple regression reveals the relative weight given to each predictor in predicting performance. These two relative weights are called beta weights. The beta weight for ability will be large to the extent that ability makes a contribution to prediction that is above and beyond the prediction of performance made using job knowledge. The theory predicts that the beta weight for ability will be positive and large for jobs that require a high degree of innovation on the job. Because the supervisor is aware only of the worker's performance and job knowledge and is unaware of the worker's ability, a path model for the four variables should have no direct link between ability and ratings.

The data. Hunter (1983a) located 14 studies that measured at least three of the four key theoretical variables. He analyzed the correlations between them for an incumbent population because he was interested in performance appraisal, in which the focal population is incumbents. For purposes of discussing personnel selection, the relevant population is the applicant population. Therefore, the correlations considered here are corrected for range restriction using the average incumbent-applicant standard deviation ratio of .67 found for 425 job performance studies conducted by the U.S. Employment Service (Hunter, 1980b). These correlations were given in Hunter (1984, 1985, 1986). The basic results are presented in Figure 1 and Table 6.

[Figure 1 appears here: a path analysis of cognitive ability, job knowledge, job performance, and supervisor ratings, with separate path diagrams for civilian jobs and military jobs.]

Table 6 presents the obtained correlations between ability, knowledge, performance, and ratings in civilian and military work. Figure 1 presents the path analysis that fits these data. The data are broken down separately for military and civilian studies. The differences in the results are quantitative rather than qualitative. Discussion of these differences is beyond the scope of this article.

The predictions of the classic learning theory are borne out in the data. The theory predicts a high correlation between ability and knowledge. The correlation between ability and knowledge is .80 in the civilian data and .63 in the military data. The classic theory predicts a high correlation between knowledge and performance. The correlation between knowledge and performance is .80 in civilian data and .70 in military data. The classic theory predicts a high correlation between ability and performance.
The correlation between ability and performance is .75 in civilian data and .53 in military data. The classic theory predicts that ability will be more highly
correlated with performance than can be explained solely on the basis of the correlation between ability and knowledge. The beta weight for ability (with knowledge held constant) is +.31 in the civilian data and +.15 in the military data. The classic theory predicts that ability will be correlated with ratings. The correlation between ability and ratings is .47 in the civilian data and .24 in the military data. The classic theory predicts that a path model will fit the data without a direct link between ability and ratings. This, too, is true for both the civilian and the military data.

Thus, the classic theory created by learning psychologists and supported by so much other data also fits the validation data. Every major prediction made by the classic theory is verified by the data. The classic theory is also supported by other similar path analytic studies; this research is discussed in Schmidt and Hunter (1992).

Summary of ability, knowledge, and performance. For purposes of practical validation, it does not matter why cognitive ability correlates with performance. The only practical question for selection is just how high the validity is. This
question is answered by the data on work-sample performance. However, it is very important to explain why general cognitive ability predicts job performance (Schmidt & Hunter, 1992). The data on job knowledge show that cognitive ability predicts job performance in large part because it predicts learning and job mastery. Table 6 shows that ability is highly correlated with job knowledge and that job knowledge is highly correlated with job performance. The path analysis in Figure 1 shows that this indirect causal path accounts for most of the effect of ability on performance. However, the beta weight (path coefficient) from ability to performance is large for civilian jobs and moderate for military jobs. Thus, ability impacts performance directly, not just through job knowledge. This confirms the prediction made by the cognitive skills literature. It shows that immediate cognitive skills are used in work: High-ability workers are faster at cognitive operations on the job, are better able to prioritize between conflicting rules, are better able to adapt old procedures to altered situations, are better able to innovate to meet unexpected problems, and are better able to learn new procedures quickly as the job changes over time. In many current validity studies, investigators have found that complex jobs are linked to a high degree of judgment, reasoning, and planning. The positive beta weight for ability in Figure 1 supports those linkage analyses. The evidence reviewed above clearly confirms the learning theory of job performance and the skills theory of job analysis. Large amounts of procedural knowledge are required for high performance in almost every job in the modern economy. Thus, both theories predict uniformly high predictive validity for general cognitive ability predicting job knowledge and actual job performance. Both theories predict lower but still substantial predictive validity for performance ratings. These predictions are all supported in the data. The path analysis verifies the theory in pattern of results as well as quantitative outcomes.
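The beta weights reported above follow directly from the corrected correlations by the standard two-predictor regression formulas. The sketch below assumes nothing beyond the correlations quoted in the text (the variable names are ours); it reproduces the +.31 civilian and +.15 military beta weights for ability.

```python
def beta_weights(r_y1: float, r_y2: float, r_12: float) -> tuple:
    """Standardized regression (beta) weights for two predictors of y.

    r_y1, r_y2: correlations of predictors 1 and 2 with the criterion
    r_12: intercorrelation of the two predictors
    """
    denom = 1 - r_12**2
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    return beta1, beta2

# Civilian data: r(ability, performance) = .75,
# r(knowledge, performance) = .80, r(ability, knowledge) = .80
b_ability, b_knowledge = beta_weights(0.75, 0.80, 0.80)
print(round(b_ability, 2), round(b_knowledge, 2))  # -> 0.31 0.56

# Military data: r(ability, performance) = .53,
# r(knowledge, performance) = .70, r(ability, knowledge) = .63
b_ability, b_knowledge = beta_weights(0.53, 0.70, 0.63)
print(round(b_ability, 2), round(b_knowledge, 2))  # -> 0.15 0.61
```

Because the computed values match those reported in the text, a reader can verify the regression results from the corrected correlations in Table 6 alone.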

Social Policy Implications

It is important to recognize that social policy has its limitations in this area, as in others. Certain powerful trends will continue regardless of the social policies adopted. People in the workforce will continue to sort themselves into jobs on the basis of mental ability, as demonstrated by Wilk and Sackett (1996) and Wilk, Desmarais, and Sackett (1995). Jobs and careers will continue to become more complex and more dependent on intelligence. Although there is a secular trend toward gradually increasing average mental-ability levels (Flynn, 1984), intelligence requirements in the world of work are increasing at a much faster rate, far outstripping intelligence gains. This problem is most severe for those in the lower half of the intelligence distribution; few new jobs suitable for below-average individuals are being created, and many are being eliminated by automation and computerization. As a result, an increasing proportion of the potential workforce is being excluded from the world of work because of inadequate mental ability. In the United States, this process is most advanced among Black Americans, because the average ability level is lower for that group. However, this process is also operating in the White group and will become increasingly severe in that group with time.

Some have argued that jobs are, in fact, still available today to low-ability people, albeit low-wage jobs (e.g., jobs in fast food restaurants). There is a kernel of truth in this argument. However, in the past such jobs were typically starter jobs, the first rung on a ladder to higher paying jobs. Today, this ladder no longer exists for many people in the bottom half of the intelligence distribution, because the jobs above the lowest rung have increased in complexity and therefore in ability requirements. For most of these individuals, this change removes the possibility of moving up to a wage capable of supporting a family. Although these low-wage jobs are suitable as sources of supplementary income for teenagers and students, they typically do not provide enough income (even in the case of combined incomes) to have and support a family. Serious social problems—for example, lowered marriage rates, increased marital instability and divorce, crime, and illegitimacy—result when large numbers of people cannot create and support families (Gottfredson, in press).

These changes are driven by deeply rooted causes inherent in the dynamics of the United States and world economies and are therefore probably beyond the power of any feasible social policy to change. However, social policy is important and can have an impact. Current social policy strongly discourages hiring and placing people in jobs on the basis of intelligence, even when the consequences of not doing so are severe. For example, in Washington, DC, in the late 1980s, mental ability requirements were virtually eliminated in the hiring of police officers, resulting in severe and socially dangerous decrements in the performance of the police force (Carlson, 1993a, 1993b). More recently, under pressure from the U.S. Department of Justice, changes in the selection process for police hiring in Nassau County, New York, virtually eliminated mental ability requirements (Gottfredson, 1996). In a large U.S. steel company, reduction of mental ability requirements in the selection of applicants for skilled trades apprenticeships resulted in documented, dramatic declines in the quality and quantity of work performed (Schmidt & Hunter, 1981). As industrial psychologists, we are familiar with numerous cases such as these resulting from current social policy, although most have not been quantified and documented. This social policy also has a negative effect on U.S. international competitiveness in the global economy (Schmidt, 1993).

The source of many of these policies has been the interpretation that government agencies (such as the Equal Employment Opportunity Commission and the Department of Justice) and some courts have placed on Title VII of the 1964 Civil Rights Act and its subsequent amendments. Some minorities, in particular Blacks and Hispanics, typically have lower average scores on employment tests of aptitudes and abilities, resulting in lower hiring rates. The theory of adverse impact holds that such employment tests cause these differences rather than merely measure them. That is, this theory attributes the score differences and the hiring rate differences to biases in the tests. However, a large body of research shows that employment (and educational) tests of ability and aptitude are not predictively biased (Hartigan & Wigdor, 1989; Hunter, 1981b; Hunter & Schmidt, 1982a; Schmidt & Hunter, 1981; Schmidt et al., 1992; Wigdor & Garner, 1982).
That is, the finding is that any given test score has essentially the same implications for future job performance regardless of the applicant's group membership. For example, Whites and Blacks with low test scores are equally likely to fail on the job. Hence, research findings directly contradict the theory of adverse impact and the requirements that social policy has imposed on employers on the basis of that theory.

The major requirement stemming from the theory of adverse impact has been costly and complicated validation studies for any hiring and promotion procedure that shows group disparities. In particular, employers desiring to select on the basis of ability must meet these expensive and time-consuming requirements. These requirements strongly encourage the abandonment of ability requirements in job selection, resulting in reduced levels of job performance and output among all employees, not merely minority employees (Schmidt & Hunter, 1981).

What should social policy be in the area of employee selection? Social policy should encourage employers to hire on the basis of mental ability. The research findings discussed earlier show that such a policy is likely to maximize economic efficiency and growth (including job growth), resulting in increases in the general standard of living (Hunter & Schmidt, 1982b). However, social policy should also encourage the use in hiring of those noncognitive methods known both to decrease minority-majority hiring rate differences and to increase validity (and, hence, job performance). That is, social policy should exploit research findings on the role of personality and other noncognitive traits in job performance to simultaneously reduce hiring rate differences and increase the productivity gains from selection (Ones, Viswesvaran, & Schmidt, 1993; Schmidt et al., 1992); a numerical sketch follows below.

Hunter and Schmidt (1977) pointed out that if an additional selection method exists (beyond mental ability) that the employer is not currently using and that is both valid and has little or no adverse impact, then there are two possibilities. The first possibility is that the employer is not aware of this method that could be added to his or her selection procedure. In that case, the employer is not discriminatory. The second possibility is that the employer is aware of this second method and yet does not add it to his or her selection procedure. This is defined as discriminatory. Today, many employers are in the first category: They are not aware of the research findings on personality tests measuring conscientiousness and integrity. The role of social policy should be to make them aware and therefore responsible for using these procedures whenever it is feasible to do so.

The goal of current social policy is equal representation of all groups in all jobs and at all levels of job complexity. Even with fully nondiscriminatory and predictively fair selection methods, this goal is unrealistic, at least at the present time, because groups today differ in mean levels of job-relevant skills and abilities. They also differ in mean age and education level, further reducing the feasibility of this policy goal. The current pursuit of this unrealistic policy goal results not only in frustration but also in social disasters of the sort that befell the Washington, DC, police force. This unrealistic policy goal should give way to a policy of eradicating all remaining real discrimination against individuals in the workplace.
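The arithmetic behind the recommendation to add a valid, low-adverse-impact predictor is easy to check. The sketch below is a hypothetical illustration, not a result from this article: the validities, predictor intercorrelation, and group differences are round numbers chosen only to be broadly in line with the meta-analytic literature cited above (e.g., Ones, Viswesvaran, & Schmidt, 1993). It computes the validity and the standardized group difference of a regression-weighted composite of an ability test and an integrity test.

```python
import math

# Hypothetical inputs (illustrative round values, not figures from this article):
r_ability = 0.50     # assumed validity of a general mental ability test
r_integrity = 0.40   # assumed validity of an integrity test
r_12 = 0.00          # assumed ability-integrity intercorrelation (near zero)
d_ability = 1.00     # assumed group mean difference on the ability test (SD units)
d_integrity = 0.00   # assumed group difference on the integrity test (near zero)

# Standardized regression weights for the two predictors
denom = 1 - r_12**2
w_ability = (r_ability - r_integrity * r_12) / denom
w_integrity = (r_integrity - r_ability * r_12) / denom

# Validity of the weighted composite (multiple correlation R)
R = math.sqrt(w_ability * r_ability + w_integrity * r_integrity)

# Group difference on the composite, expressed in composite SD units
composite_sd = math.sqrt(w_ability**2 + w_integrity**2
                         + 2 * w_ability * w_integrity * r_12)
d_composite = (w_ability * d_ability + w_integrity * d_integrity) / composite_sd

print(f"validity: ability alone = {r_ability:.2f}, composite = {R:.2f}")
print(f"group difference: ability alone = {d_ability:.2f}, "
      f"composite = {d_composite:.2f}")
```

Under these assumed inputs, adding the second predictor raises validity from .50 to about .64 while reducing the standardized group difference from 1.00 to about .78. The direction of both effects, not the particular numbers, is the point: a valid noncognitive supplement improves prediction and shrinks hiring rate differences at the same time.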
The chief industrial psychologist in a large manufacturing firm told one of us that his firm had achieved nondiscrimination in employment, promotion, and other personnel areas. He stated that the firm still had some discrimination against Blacks and women but that this was balanced out by the discrimination (preference) in favor of Blacks and women in the firm's affirmative action programs. So, on balance, the firm was nondiscriminatory! Actually, the firm simply had two types of discrimination, neither of which should have existed. Both types of discrimination cause employee dissatisfaction and low morale. Discrimination causes resentment and bitterness, because it violates deeply held American values.

Defenders of the present policy often argue that the task of eliminating all individual-level discrimination is formidable, excessively time-consuming, and costly. They argue that the use of minority hiring goals, timetables, and quotas is much more resource-efficient. However, it is also ineffective, socially divisive, and productive of social disasters of the type described earlier.

The policy change recommended here should be made throughout the economy. However, it is particularly important that it be made in those segments of the economy that must compete internationally. Some segments of the economy do not compete internationally, for example, many service areas such as restaurants and food service, insurance, cosmetology and barbering, and much consulting. On the other hand, almost all types of manufacturing and many areas of agriculture do have to compete in the world economy. We may at minimum need a two-tier economy: an export-oriented tier, in which hiring and placement can be on merit only, and an internally oriented tier, in which affirmative action requirements (with racial and ethnic preferences) may continue to be imposed. These requirements would reduce the efficiency of the domestic economy but would affect all competitors equally. It is interesting that Japan and Korea, even in the absence of U.S.-type affirmative action requirements, have created such two-tier economies (Schmidt, 1993) in order to enhance international competitiveness: Their educational and hiring practices funnel the brightest workers into their export-oriented industries. It may be important for American international competitiveness that U.S. policy allow for a similar arrangement. Under current social policy, this would not be possible.

References

Barrick, M. R., & Mount, M. K. (1991). The Big Five personality dimensions and job performance: A meta-analysis. Personnel Psychology, 44, 1-26.
Borman, W. C., White, L. A., & Dorsey, D. W. (1995). Effects of ratee task performance and interpersonal factors on supervisor and peer ratings. Journal of Applied Psychology, 80, 168-177.
Brolyer, C. R., Thorndike, E. L., & Woodyard, E. R. (1927). A second study of mental discipline in high school studies. Journal of Educational Psychology, 18, 377-404.
Carlson, T. (1993a, Winter). D.C. blues: The rap sheet on the Washington police. Policy Review, 27-33.
Carlson, T. (1993b, November 3). Washington's inept police force. The Wall Street Journal.
Carroll, J. B. (1986). A critical synthesis of knowledge about cognitive abilities (Final Tech. Rep., NSF Grant BNS 8212486). Chapel Hill: University of North Carolina.
Carroll, J. B. (1993). Human cognitive abilities. Cambridge, England: Cambridge University Press.
Dunnette, M. D. (1972). Validity study results for jobs relevant to the petroleum industry. Washington, DC: American Petroleum Institute.
Ekstrom, R. B., French, J. W., Harman, H. H., & Dermen, D. (1976). Kit of factor-referenced cognitive tests. Princeton, NJ: Educational Testing Service.
Flynn, J. R. (1984). The mean IQ of Americans: Massive gains, 1932-1978. Psychological Bulletin, 95, 29-51.


French, J. W. (1951). The description of aptitude and achievement factors in terms of rotated factors. Psychometric Monographs (No. 5).
French, J. W. (1954). Manual of selected tests for reference aptitude and achievement factors. Princeton, NJ: Educational Testing Service.
Ghiselli, E. E. (1966). The validity of occupational aptitude tests. New York: Wiley.
Ghiselli, E. E. (1973). The validity of aptitude tests in personnel selection. Personnel Psychology, 26, 461-477.
Gottfredson, L. S. (1996). Racially gerrymandering the content of police tests to satisfy the U.S. Justice Department: A case study. Psychology, Public Policy, and Law, 2, 418-446.
Gottfredson, L. S. (in press). Why g matters: The complexity of everyday life. Intelligence.
Hartigan, J. A., & Wigdor, A. K. (Eds.). (1989). Fairness in employment testing. Washington, DC: National Academy of Sciences Press.
Hirsh, H. R., Northrup, L. C., & Schmidt, F. L. (1986). Validity generalization results for law enforcement occupations. Personnel Psychology, 39, 399-420.
Hough, L. M., Eaton, N. K., Dunnette, M. D., Kamp, J. D., & McCloy, R. A. (1990). Criterion-related validities of personality constructs and the effect of response distortion on those validities. Journal of Applied Psychology, 75, 581-595.
Hunter, J. E. (1980a). Construct validity and validity generalization. In Construct validity in psychological measurement (pp. 119-129). Princeton, NJ: Educational Testing Service.
Hunter, J. E. (1980b). The dimensionality of the General Aptitude Test Battery (GATB) and the dominance of general factors over specific factors in the prediction of job performance. Washington, DC: U.S. Employment Service.
Hunter, J. E. (1980c). Test validation for 12,000 jobs: An application of synthetic validity and validity generalization to the General Aptitude Test Battery (GATB). Washington, DC: U.S. Employment Service.
Hunter, J. E. (1981a). The economic benefits of personnel selection using ability tests: A state-of-the-art review including a detailed analysis of the dollar benefit of U.S. Employment Service placements and a critique of the low-cutoff method of test use. Washington, DC: U.S. Employment Service.
Hunter, J. E. (1981b). Fairness of the General Aptitude Test Battery (GATB): Ability differences and their impact on minority hiring rates. Washington, DC: U.S. Employment Service.
Hunter, J. E. (1983a). A causal analysis of cognitive ability, job knowledge, job performance, and supervisor ratings. In F. Landy, S. Zedeck, & J. Cleveland (Eds.), Performance measurement theory (pp. 257-266). Hillsdale, NJ: Erlbaum.
Hunter, J. E. (1983b). Validity generalization of the ASVAB: Higher validity for factor analytic composites. Rockville, MD: Research Applications.
Hunter, J. E. (1984). The prediction of job performance in the civilian sector using the ASVAB. Rockville, MD: Research Applications.
Hunter, J. E. (1985). Differential validity across jobs in the military. Rockville, MD: Research Applications.
Hunter, J. E. (1986). Cognitive ability, cognitive aptitudes, job knowledge, and job performance. Journal of Vocational Behavior, 29, 340-362.
Hunter, J. E. (1990). The validity of the Denver firefighters examination. Denver, CO: City of Denver, City Attorney.
Hunter, J. E., & Hirsh, H. R. (1987). Applications of meta-analysis. In C. L. Cooper & I. T. Robertson (Eds.), International review of industrial and organizational psychology (Vol. 2, pp. 321-357). New York: Wiley.
Hunter, J. E., & Hunter, R. F. (1984). Validity and utility of alternate predictors of job performance. Psychological Bulletin, 96, 72-98.


Hunter, J. E., & Schmidt, F. L. (1977). A critical analysis of the statistical and ethical definitions of test fairness. Psychological Bulletin, 83, 1053-1071.
Hunter, J. E., & Schmidt, F. L. (1982a). Ability tests: Economic benefits versus the issue of fairness. Industrial Relations, 21, 293-308.
Hunter, J. E., & Schmidt, F. L. (1982b). Fitting people to jobs: Implications of personnel selection for national productivity. In E. A. Fleishman & M. D. Dunnette (Eds.), Human performance and productivity: Human capability assessment (Vol. 1, pp. 233-284). Hillsdale, NJ: Erlbaum.
Hunter, J. E., Schmidt, F. L., & Judiesch, M. K. (1990). Individual differences in output variability as a function of job complexity. Journal of Applied Psychology, 75, 28-42.
Jensen, A. R. (1980). Bias in mental testing. New York: Free Press.
Jensen, A. R. (1986). g: Artifact or reality? Journal of Vocational Behavior, 29, 301-331.
Lawshe, C. H. (1952). What can industrial psychology do for small business: 2. Employee selection. Personnel Psychology, 5, 31-34.
Lawshe, C. H. (1975). A quantitative approach to content validity. Personnel Psychology, 28, 563-575.
Lawshe, C. H. (1984, October). A practitioner's thoughts on job analysis. Paper presented at Content Validity III, Bowling Green State University, Bowling Green, OH.
McDaniel, M. A. (1985). The evaluation of a causal model of job performance: The interrelationships of general mental ability, job experience, and job performance. Unpublished doctoral dissertation, George Washington University.
McDaniel, M. A., Schmidt, F. L., & Hunter, J. E. (1988). Job experience correlates of job performance. Journal of Applied Psychology, 73, 327-330.
Motowidlo, S. J., & Van Scotter, J. R. (1994). Evidence that task performance should be distinguished from contextual performance. Journal of Applied Psychology, 79, 475-480.
Mount, M. K., & Barrick, M. R. (1995). The Big Five personality dimensions: Implications for research and practice in human resources management. In G. Ferris (Ed.), Research in personnel and human resources management (Vol. 13, pp. 153-200). Greenwich, CT: JAI Press.
Northrup, L. C. (1986). Validity generalization results for apprentice and helper-trainer positions. Washington, DC: U.S. Office of Personnel Management, Office of Staffing Policy.
Olea, M. M., & Ree, M. J. (1994). Predicting pilot and navigator criteria: Not much more than g. Journal of Applied Psychology, 79, 845-851.
Ones, D. S. (1993). The construct validity of integrity tests. Unpublished doctoral dissertation, University of Iowa.
Ones, D. S., Viswesvaran, C., & Schmidt, F. L. (1993). Comprehensive meta-analysis of integrity test validities: Findings and implications for personnel selection and theories of job performance. Journal of Applied Psychology, 78, 679-703.
Organ, D. W. (1990). Organizational citizenship behavior: The good soldier syndrome. Lexington, MA: Lexington Books.
Organ, D. W., & Konovsky, M. A. (1989). Cognitive versus noncognitive determinants of organizational citizenship behavior. Journal of Applied Psychology, 74, 157-164.
Organ, D. W., & Ryan, K. (1995). A meta-analytic review of attitudinal and dispositional predictors of organizational citizenship behavior. Personnel Psychology, 48, 775-802.
Orr, J. M., Sackett, P. R., & Mercer, M. (1989). The role of prescribed and nonprescribed behaviors in estimating the dollar value of performance. Journal of Applied Psychology, 74, 34-40.
Pearlman, K., Schmidt, F. L., & Hunter, J. E. (1980). Validity generalization results for tests used to predict training success and job proficiency in clerical occupations. Journal of Applied Psychology, 65, 373-406.


Ree, M. J., & Carretta, T. R. (in press). General cognitive ability and occupational performance. In C. L. Cooper & I. T. Robertson (Eds.), International review of industrial and organizational psychology, 1998. London: Wiley.
Ree, M. J., & Earles, J. A. (1991). Predicting training success: Not much more than g. Personnel Psychology, 44, 321-332.
Ree, M. J., & Earles, J. A. (1992). Intelligence is the best predictor of job performance. Current Directions in Psychological Science, 1, 86-89.
Ree, M. J., Earles, J. A., & Teachout, M. (1994). Predicting job performance: Not much more than g. Journal of Applied Psychology, 79, 518-524.
Reilly, R. R., & Chao, G. T. (1982). Validity and fairness of some alternative employee selection procedures. Personnel Psychology, 35, 1-62.
Sackett, P. R., & Wanek, J. E. (1996). New developments in the use of measures of honesty, integrity, conscientiousness, dependability, trustworthiness, and reliability for personnel selection. Personnel Psychology, 49, 787-830.
Schmidt, F. L. (1993). Personnel psychology at the cutting edge. In N. Schmitt & W. Borman (Eds.), Personnel selection (pp. 497-515). San Francisco: Jossey-Bass.
Schmidt, F. L., & Hunter, J. E. (1981). Employment testing: Old theories and new research findings. American Psychologist, 36, 1128-1137.
Schmidt, F. L., & Hunter, J. E. (1983). Individual differences in productivity: An empirical test of estimates derived from studies of selection procedure utility. Journal of Applied Psychology, 68, 407-414.
Schmidt, F. L., & Hunter, J. E. (1992). Causal modeling of processes determining job performance. Current Directions in Psychological Science, 1, 89-92.
Schmidt, F. L., & Hunter, J. E. (in press). Measurable personnel characteristics: Stability, variability, and validity for predicting future job performance and job-related learning. In M. Kleinmann & B. Strauss (Eds.), Instruments for potential assessment and personnel development. Göttingen, Germany: Hogrefe.
Schmidt, F. L., Hunter, J. E., & Caplan, J. R. (1981a). Selection procedure validity generalization (transportability) results for three job groups in the petroleum industry. Unpublished technical report, available from John E. Hunter.
Schmidt, F. L., Hunter, J. E., & Caplan, J. R. (1981b). Validity generalization results for two job groups in the petroleum industry. Journal of Applied Psychology, 66, 261-273.
Schmidt, F. L., Hunter, J. E., McKenzie, R., & Muldrow, T. (1979). The impact of valid selection procedures on workforce productivity. Journal of Applied Psychology, 64, 609-626.
Schmidt, F. L., Hunter, J. E., Outerbridge, A. N., & Trattner, M. H. (1986). The economic impact of job selection methods on the size, productivity, and payroll costs of the Federal work-force: An empirical demonstration. Personnel Psychology, 39, 1-29.
Schmidt, F. L., Hunter, J. E., Outerbridge, A. N., & Goff, S. (1988). The joint relation of experience and ability with job performance: A test of three hypotheses. Journal of Applied Psychology, 73, 46-57.
Schmidt, F. L., Ones, D., & Hunter, J. E. (1992). Personnel selection. Annual Review of Psychology, 43, 627-670.
Schmitt, N., Gooding, R. Z., Noe, R. A., & Kirsch, M. (1984). Meta-analyses of validity studies published between 1964 and 1982 and the investigation of study characteristics. Personnel Psychology, 37, 407-422.
Thorndike, R. L. (1986). The role of general ability in prediction. Journal of Vocational Behavior, 29, 332-339.
Trattner, M. H. (1988). The validity of aptitude and ability tests used to select professional personnel. Washington, DC: U.S. Office of Personnel Management, Personnel Research and Development Center.
Tyler, L. E. (1965). The psychology of human differences. New York: Appleton-Century-Crofts.


Vernon, P. E. (1956). The measurement of abilities (2nd ed.). London: University of London Press.
Vineberg, R., & Joyner, J. N. (1982). Prediction of job performance: Review of military studies. Alexandria, VA: Human Resources Research Organization.
Wigdor, A. K., & Garner, W. R. (Eds.). (1982). Ability testing: Uses, consequences, and controversies (Report of the National Research Council Committee on Ability Testing). Washington, DC: National Academy of Sciences Press.
Wilk, S. L., Desmarais, L. B., & Sackett, P. R. (1995). Gravitation to jobs commensurate with ability: Longitudinal and cross-sectional tests. Journal of Applied Psychology, 80, 79-85.
Wilk, S. L., & Sackett, P. R. (1996). Longitudinal analysis of ability-job complexity fit and job change. Personnel Psychology, 49, 937-967.
