Assessing Measurement Invariance of the Children's Depression ...

Assessment http://asm.sagepub.com/

Assessing Measurement Invariance of the Children's Depression Inventory in Chinese and Italian Primary School Student Samples Wenfeng Wu, Yongbiao Lu, Furong Tan, Shuqiao Yao, Patrizia Steca, John R. Z. Abela and Benjamin L. Hankin Assessment 2012 19: 506 originally published online 12 September 2011 DOI: 10.1177/1073191111421286 The online version of this article can be found at: http://asm.sagepub.com/content/19/4/506

Published by: http://www.sagepublications.com


Version of Record: Nov 4, 2012. OnlineFirst Version of Record: Sep 12, 2011.


Assessing Measurement Invariance of the Children’s Depression Inventory in Chinese and Italian Primary School Student Samples

Assessment 19(4) 506–516 © The Author(s) 2012 Reprints and permission: sagepub.com/journalsPermissions.nav DOI: 10.1177/1073191111421286 http://asm.sagepub.com

Wenfeng Wu1,2, Yongbiao Lu2, Furong Tan1, Shuqiao Yao1, Patrizia Steca3, John R. Z. Abela4*, and Benjamin L. Hankin5

Abstract

This study tested the measurement invariance of the Children’s Depression Inventory (CDI) and compared its factor variances/covariances and latent means among Chinese and Italian children. Multigroup confirmatory factor analysis of the original five factors identified by Kovacs revealed that full measurement invariance did not hold. Further analysis showed that 4 of 21 factor loadings, 14 of 26 intercepts, and 12 of 26 item errors were noninvariant. Tests of factor variance and covariance invariance revealed significant differences between the Chinese and Italian samples. The latent factor mean comparison suggested no significant difference across the two groups. Nevertheless, the finding of partial metric and scalar invariance suggested that observed mean differences on the CDI items cannot be fully explained by the mean differences in the latent factor. These results suggest that researchers and practitioners should exercise caution when gauging the size of the true national population differences in depressive symptoms among Italian and Chinese children when assessed via the CDI. In addition to providing needed evidence on the use of the CDI in Italian and Chinese children specifically, the methods used in this research can serve more generally as an example for other cross-cultural assessment research of how to test structural equivalence and measurement invariance of scales and why it is important to do so.

Keywords

Children’s Depression Inventory, measurement invariance, latent mean comparison, Chinese children, Italian children

*John R. Z. Abela has passed away.
1 Central South University, Changsha, Hunan, People’s Republic of China
2 Hunan University of Science and Technology, Xiangtan, Hunan, People’s Republic of China
3 University of Milan–Bicocca, Milan, Italy
4 Rutgers–The State University of New Jersey, New Brunswick, NJ, USA
5 University of Denver, Denver, CO, USA

Corresponding Author: Shuqiao Yao, The Medical Psychological Institute of the Second Xiangya Hospital, Central South University, Changsha, Hunan 410011, Republic of China
Email: [email protected]

The Children’s Depression Inventory (CDI; Kovacs, 2003) is one of the most widely used and studied scales for assessing depressive symptoms in children and adolescents (Klein, Dougherty, & Olino, 2005; Twenge & Nolen-Hoeksema, 2002). The CDI was developed by Kovacs and colleagues in 1977 and has been used most often to measure depressive symptoms among normal populations of children. Previous research has shown that the CDI is adequately reliable and valid with respect to depressive symptoms (Craighead, Smucker, Craighead, & Ilardi, 1998; Crowley, Worchel, & Ash, 1992; Finch, Saylor, & Edwards, 1985; Kovacs, 1980; Smucker, Craighead, Craighead, & Green, 1986; Weiss et al., 1991). In addition, the CDI is a universal screening instrument and is effective for identifying positive cases and ruling out negative cases with a high degree of certainty (Levitt, Saka, Hunter Romanelli, & Hoagwood, 2007).

The CDI has been translated into several languages and validated in numerous countries (Kovacs, 2003). Outside of North America, CDI data have been collected from countries such as China (Yu & Li, 2000), Japan (Murata, Tsutsumi, Sarada, & Nakaniwa, 1992), Korea (Cho & Lee, 1990), Malaysia (Rosliwati, Rohayah, Jamil, & Zaharah, 2008), Greece (Giannakopoulos et al., 2009), Italy (Poli, Sbrana, Marcheschi, & Masi, 2003), and Sweden (Ivarsson, Svalander, & Litlere, 2006). With regard to Chinese cultures, the CDI has been validated in mainland China (Dong, Yang, & Ollendick, 1994; Yu & Li, 2000), Hong Kong (Chan, 1997), and Taiwan (Liu, 2003). Using exploratory factor analysis, Chan (1997) found that the factor structure of the CDI among individuals from Hong Kong was similar to that reported by Kovacs (2003). In an Italian sample, however, seven factors were extracted using exploratory factor analysis (Poli et al., 2003), which differed from Kovacs’s original five-factor model of the CDI. Finally, in a cross-national study, Charman and Pervova (2001) reported that the CDI was equally reliable with samples of schoolchildren from Russia and the United Kingdom.

Although depressive symptoms may be a universal latent trait existing across cultures (Weissman et al., 1996), this does not necessarily imply that children’s depressive symptoms are measured equivalently across cultural groups. The major focus of our study was to determine whether the CDI exhibits measurement invariance (or measurement equivalence) across two national samples. Empirical evidence of measurement invariance is a prerequisite for considering the scale to be consistent across different populations and is a condition for making valid and interpretable comparisons of the differences in the scores (Billiet, 2002; Little, 1997; Widaman & Reise, 1997). In other words, measurement equivalence needs to be established to evaluate whether observed mean differences in manifest scale scores indicate the presence or absence of true differences on the latent trait.

Given the widespread use of the CDI and the frequent comparisons between different countries and cultural groups (e.g., Chan, 1997; Charman & Pervova, 1996; Frigerio et al., 2001; Poli et al., 2003), it is notable that no study has examined measurement invariance of the CDI across cultural contexts. We believe that such a study is important.
For example, the Beck Depression Inventory (BDI) is used to assess depressive symptoms in adults and is very similar to the CDI in form and purpose. Some cross-national measurement invariance studies of the BDI (e.g., Byrne & Campbell, 1999; Byrne, Stewart, Kennard, & Lee, 2007; Nuevo et al., 2009) have been reported, and these studies provide important information on the cross-national validity and latent mean differences of the BDI. Given that the CDI is a downward extension of the BDI, conducting a cross-cultural measurement invariance study of the CDI will fill an important gap in the literature and enable enhanced understanding of its potential cross-cultural validity and latent mean differences.

A few prior studies have examined measurement invariance in the CDI using different groups. Investigating potential gender differences, Carle, Millsap, and Cole (2008) examined CDI measurement invariance across gender using confirmatory factor analysis for ordered-categorical measures (CFA-OCM) and rating scale item response theory (IRT) analyses. Their results supported the use of the CDI across genders. Examining potential racial differences, Steele and Stephen (2006) used samples of African American and European American youth with multigroup confirmatory factor analyses (MGCFAs) based on the original five factors identified by Kovacs (2003). Results revealed that CDI items exhibited invariant measurement properties across samples. Building on these prior studies, we used MGCFA procedures to examine measurement invariance of the CDI in samples of Chinese and Italian children.

Given that these two prior invariance studies were conducted with American samples, it is important to consider potential similarities and differences among American, Italian, and Chinese cultural groups. Similar to American and Western European cultures, Italian culture values sociability, assertiveness, and self-expression (Casiglia, Lo Coco, & Zappulla, 1998). In contrast, Chinese culture emphasizes self-restraint, cautiousness, and cooperation (Chen, Liu, & Li, 2000). These cultural differences have been labeled as Individualism and Collectivism (Hofstede, 1993; Markus & Kitayama, 1991). Usually, researchers conceptualize individualism as the opposite of collectivism (e.g., Hui, 1988). In individualistic cultures, people are socialized to value uniqueness and individual autonomy and to hold positive views of themselves (e.g., Mezulis, Abramson, Hyde, & Hankin, 2004) and their accomplishments, and doing so is considered to be a sign of mental health. However, in collectivist cultures, individuals tend to be socialized to be part of an interdependent group, avoid interpersonal tension through self-criticism, and exhibit moderate emotional expression, and feeling good about oneself may be a sign of maladjustment (Diener & Diener, 1995).
Kleinman (1977) found that 88% of Taiwanese psychiatric patients in his sample initially reported somatic complaints without affective complaints, whereas the comparable figure for European Americans was 20%. Based on these findings, Kleinman (1977) hypothesized that collectivist cultural groups may tend to respond to stressful life events in ways that emphasize somatic rather than affective symptoms. Affective, intrapsychological aspects of depressive symptoms may be experienced by collectivist cultural groups as more self-centered and, hence, may be more destructive to group harmony than somatic symptoms.

Although Italian culture tends to align mostly with individualism, and Chinese culture is most consistent with collectivism, the two cultures have some common points (Broilo, Spinola, & Duzert, 2010). Both cultures place strong emphasis on the importance of the family (Attili, Vermigli, & Schneider, 1997). Family conflict and disruption are important factors affecting risk for youth depression (Abela & Hankin, 2008). Overall, given these cultural differences and similarities, we posited that there might be both differences and commonalities when examining the measurement invariance of the CDI across samples of Chinese and Italian children.

A variety of procedures are available for examining whether measures are commensurate across populations. One method is MGCFA (Jöreskog, 1971; Meredith, 1993). This method assesses the underlying latent structure of a test by explaining covariance between the test items through a limited number of factors. The MGCFA approach includes both the mean and covariance structures (MACS) in the analyses (i.e., MACS models). Such analyses compare the factor loadings as well as the latent mean structures across samples. As with traditional covariance-based CFA, MACS analyses explicitly adjust both the loadings and the means for measurement error, thereby allowing a clear examination of the full psychometric properties of the CDI items. Moreover, MACS analyses provide the ability to explicitly examine the measurement equivalence of the CDI (Little, 1997; Little & Slegers, 2005).

Before examining measurement invariance in samples of Chinese and Italian children, a suitable structure of the CDI should be determined to build the baseline model. The first aim of this study was to explore the measurement invariance of the baseline model across children sampled from China and Italy. Based on our review and results from Steele and Stephen’s (2006) study, we hypothesized that configural invariance and weak factorial invariance would hold. Furthermore, we expected that strong and strict invariance would not be fully supported, although partial invariance may exist.

Using exploratory factor analysis, many studies in different countries have obtained different factor structures for the CDI (Charman & Pervova, 2001; Poli et al., 2003; Samm et al., 2008), but few studies have used confirmatory factor analysis to confirm the CDI structure. Thus, our second aim was to compare factor variance/covariance between the two samples using multigroup confirmatory factor analysis. Again, based on the limited prior research (Steele & Stephen, 2006), we expected that factor variance/covariance would exhibit partial invariance.

The last purpose of this study was to compare the latent variable means on the CDI between the two cultural groups. Given differences between Italian and Chinese cultures (Chentsova-Dutton & Tsai, 2008), we posited that full latent mean invariance would not hold, but partial invariance may be supported.

Method

Participants

Kaplan and George (1995) reported that when sample sizes are equal, the power of the Wald test for detecting latent mean differences is relatively robust to violations of factorial invariance. With increasingly unequal sample sizes, power decreases substantially, even when full loading invariance exists. For these reasons, we sought to make the sizes of our two samples from China and Italy approximately equal.

Chinese Sample

The first sample consisted of 550 Chinese children (285 males, 265 females) from two urban primary schools in the cities of Changsha and Xiangtan, Central China. The age of the participants ranged from 7 to 10 years (M = 7.78, SD = 0.62). Of the 550 children sampled, 314 were from Grade 3, and 236 were from Grade 2. All participants were born in China. Children of Han nationality accounted for 95% of the sample, and children of minority nationalities accounted for 5%. Changsha is a medium-sized city with a population of more than 2.36 million. The annual per capita disposable income of urban residents was RMB 20,238 (US$2,962.8) in 2008, and the urbanization rate was 62.63% (Hunan Provincial Bureau of Statistics, China, 2009a). Xiangtan is a medium to small city with a population of more than 1.47 million. The annual per capita disposable income of urban residents was RMB 16,109 (US$2,358.6) in 2008, and the urbanization rate was 49.90% (Hunan Provincial Bureau of Statistics, China, 2009b). Among 222 cities of China, the two cities ranked 21st and 66th, respectively, in per capita gross domestic product (National Bureau of Statistics of China, 2009).

Italian Sample

The second sample consisted of 540 Italian children (259 males, 244 females, and 37 unidentified) from 11 urban primary schools in Milan, Northern Italy. Milan is one of the world’s major financial and business centers and has a high urbanization rate. There are nearly 1.3 million citizens, and the gross domestic product per capita of the city was about €30,468 (about US$44,000; Mingione, Zajczyk, Mugnano, & Sedini, 2009). The age of the participants ranged from 7 to 9 years (M = 7.94, SD = 0.62). Of the 540 children sampled, 290 were from Grade 3, and 213 were from Grade 2. Grade information was not available for 37 participants. All participants were ethnically Italian.

Procedure

For both the Chinese and Italian samples, parents signed an informed consent statement and completed brief demographic questionnaires before data collection commenced in the schools. As the students were children, they provided informed assent. In the autumn of the academic year, psychology master’s students and advanced undergraduates administered the questionnaire. One of the research assistants administered the CDI to one class at a time during regular school time by reading each item out loud. Children proceeded at the same pace. One additional research assistant was available to answer questions before, during, and after questionnaire administration.

It is possible that the context of data collection was not identical across sites. Given this possibility, we took the following measures to minimize possible context differences between the Chinese and Italian samples: (1) We administered the CDI according to the “step-by-step administration procedure” of the CDI Manual (Kovacs, 2003, pp. 8-9); (2) Both Chinese and Italian children completed the CDI using group administration in classrooms; (3) Both Chinese and Italian research assistants were trained to administer the CDI in a uniform manner; (4) All CDI items were read aloud to all children from both the Chinese and Italian samples in order to decrease possible differences in reading comprehension, and no student discussion was permitted during CDI completion; and (5) All students had their own independent desk to complete the CDI.

Measures

The Chinese version of the CDI was developed using the back-translation method. First, the original version was translated into Chinese by one bilingual translator from the Psychology Department at Central South University (Changsha, Hunan). Next, the Chinese version was back-translated into English by another bilingual translator from the Psychology Department at Rutgers University. The original version of the CDI was then compared with the back-translation. If discrepancies arose in the back-translation, the translators worked cooperatively to make corrections to the Chinese version. Finally, the Chinese and Italian versions of the CDI were compared by bilingual researchers at Rutgers University, and no discrepancies were found.

The CDI (Kovacs, 2003) is a 27-item self-report measure of cognitive, affective, and behavioral symptoms of childhood depression. The measure was developed as a downward extension of the BDI (Beck, 1961). The CDI consists of five subscales: Anhedonia, Negative Mood, Negative Self-Esteem, Ineffectiveness, and Interpersonal Problems. Each item consists of three statements graded in order of increasing severity. Responses range from 0 to 2; the higher the numerical value, the more clinically severe the symptom is rated. Children select the sentence that best describes themselves for the past 2 weeks. In the current study, the suicide item, Item 9, was dropped in the Italian sample because of school administration concerns. Thus, only 26 items from both samples are analyzed in this report. The Chinese version of the CDI was translated by Wu, Lu, Tan, and Yao (2010), and the Italian version of the CDI was edited by Camuffo in collaboration with Kovacs (1988).

Data Analysis

We used MACS analysis to test the cross-cultural group invariance of the CDI. All models were estimated with EQS Version 6.1 (Bentler, 2005). According to Widaman and Reise (1997), the model for MACS analysis can be written as follows:

$$M_g \cong \hat{\tau}_g \hat{\tau}_g' + \hat{\Lambda}_g (\hat{\alpha}_g \hat{\alpha}_g' + \hat{\Psi}_g) \hat{\Lambda}_g' + \hat{\Theta}_{eg} = \hat{M}_g, \tag{1}$$

where M is a p × p moment matrix (or matrix of raw sums of squares and cross-products of the measured variables), M̂ is the estimated population moment matrix assuming the model is correctly specified, and a caret (^) over any other symbol denotes the estimate of that matrix; the subscript g denotes that the matrices were derived from the gth sample. τ is a p × 1 vector of intercepts for the p measured variables; Λ is a p × m matrix of the loadings of the p measured variables on the m latent variables; α is an m × 1 vector of means on the m factors; Ψ is an m × m matrix of covariances among the common factor scores; and Θe is a p × p matrix of covariances among the measurement residuals.

Based on Meredith (1993), Widaman and Reise (1997) delineated three forms of factorial invariance: weak, strong, and strict. For weak factorial invariance, the factor loadings are constrained to be equivalent across groups, so Equation (1) is transformed to Equation (2):

$$M_g \cong \hat{\tau}_g \hat{\tau}_g' + \hat{\Lambda} (\hat{\alpha}_g \hat{\alpha}_g' + \hat{\Psi}_g) \hat{\Lambda}' + \hat{\Theta}_{eg} = \hat{M}_g. \tag{2}$$

In Equation (2), the subscript g of Λ is removed, meaning that the factor loadings do not differ between groups. For strong factorial invariance, the variable intercepts are also constrained to be invariant across groups, and Equation (2) is changed to Equation (3):

$$M_g \cong \hat{\tau} \hat{\tau}' + \hat{\Lambda} (\hat{\alpha}_g \hat{\alpha}_g' + \hat{\Psi}_g) \hat{\Lambda}' + \hat{\Theta}_{eg} = \hat{M}_g. \tag{3}$$

As with Equation (2), the subscript g of τ is removed, meaning that the variable intercepts are equivalent across groups. Finally, for strict factorial invariance, Equation (4) is specified:

$$M_g \cong \hat{\tau} \hat{\tau}' + \hat{\Lambda} (\hat{\alpha}_g \hat{\alpha}_g' + \hat{\Psi}_g) \hat{\Lambda}' + \hat{\Theta}_{e} = \hat{M}_g. \tag{4}$$

Here the subscript g of Θe is removed, meaning that the covariances among the measurement residuals do not differ between groups.

To help evaluate the forms of measurement invariance, we used fit indexes. We report the root mean square error of approximation (RMSEA), which is one of the indices most sensitive to models with misspecified factor loadings (Hu & Bentler, 1998). Hu and Bentler (1999) recommend that RMSEA values less than 0.06 are needed for the model to adequately describe the data. We also report the comparative fit index (CFI). Blackburn, Donnelly, Logan, and Renwick (2004) indicated that CFI values above 0.90 represent an adequate fit. In cases when the data violated the multivariate normality assumption, the Satorra–Bentler (SB) rescaled χ² was used. Furthermore, robust maximum likelihood estimation was used in CFA to correct for this violation. The robust comparative fit index (*CFI) and robust root mean square error of approximation (*RMSEA) were also used (Byrne, 2006).

When comparing models against one another, we used the differences in chi-square and *CFI. A significant ΔSBχ² indicates that the more restrictive model fits the data significantly worse than the less restrictive model and should not be retained. For *CFI, Cheung and Rensvold (2002) suggested that the difference between models should not exceed 0.01.

Table 1. Correlation Coefficients (r) and Cronbach’s Alpha Coefficients for the Total Scale and Subscales of the CDI in the Chinese and Italian Samples

Chinese sample (n = 550)
Subscale                      1     2     3     4     5     6
1. Anhedonia                  —
2. Negative Mood             .63    —
3. Negative Self-Esteem      .59   .52    —
4. Ineffectiveness           .58   .56   .52    —
5. Interpersonal Problems    .47   .42   .34   .37    —
6. CDI total                 .89   .82   .74   .76   .64    —
Cronbach’s α                 .75   .63   .57   .50   .40   .87

Italian sample (n = 540)
Subscale                      1     2     3     4     5     6
1. Anhedonia                  —
2. Negative Mood             .56    —
3. Negative Self-Esteem      .54   .58    —
4. Ineffectiveness           .45   .49   .49    —
5. Interpersonal Problems    .54   .52   .47   .48    —
6. CDI total                 .85   .82   .77   .72   .73    —
Cronbach’s α                 .66   .66   .71   .60   .56   .87

Note. CDI = Children’s Depression Inventory.
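The two model comparison criteria described in the Data Analysis section can be sketched in a few lines of Python. This is only an illustration and not the EQS procedure used in the study: the function names are hypothetical, the chi-square tail probability is approximated with a plain power series, and the inputs are the ΔSBχ² values, model df, and *CFI values that Table 2 reports for the weak and partial weak models, each compared with the configural model. The Satorra–Bentler scaling itself is not reproduced here, because it requires scaling correction factors that the article does not list.

```python
import math


def chi2_sf(x, df):
    """Upper-tail chi-square probability via a power series (sketch only)."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, z).
    term = 1.0 / a
    total = term
    for n in range(1, 10_000):
        term *= z / (a + n)
        total += term
        if term < total * 1e-15:
            break
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def compare_nested(delta_chi2, delta_df, cfi_less, cfi_more,
                   alpha=0.05, max_cfi_drop=0.01):
    """Check both criteria for a more restrictive (nested) model.

    cfi_less / cfi_more are the CFI values of the less and the more
    restrictive model. The thresholds are assumptions taken from the text
    (a conventional alpha of .05 and a CFI drop of at most 0.01).
    """
    p_value = chi2_sf(delta_chi2, delta_df)
    cfi_drop = cfi_less - cfi_more
    return {
        "p": p_value,
        "chi2_ok": p_value >= alpha,
        # Small tolerance so a reported drop of exactly 0.01 is not
        # rejected because of floating-point rounding.
        "cfi_ok": cfi_drop <= max_cfi_drop + 1e-9,
    }


# Values as reported in Table 2 (df: configural 578, weak 599, partial weak 596).
weak = compare_nested(55.02, 599 - 578, cfi_less=0.90, cfi_more=0.89)
partial_weak = compare_nested(20.44, 596 - 578, cfi_less=0.90, cfi_more=0.90)
print(weak["chi2_ok"], weak["cfi_ok"])
print(partial_weak["chi2_ok"], partial_weak["cfi_ok"])
```

With these inputs the fully constrained weak model fails the chi-square criterion, whereas the partial weak model passes both checks, which matches the general logic of moving from a full to a partial invariance model.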

Results

Building the Baseline Model

Because the five-factor structure of the CDI identified by Kovacs (2003) has been tested using CFA and shown to be suitable for samples of Chinese children (Lam, 1999; Wu et al., 2010), this five-factor structure was used as the starting point and tested separately for goodness of fit to the Chinese and Italian data. For the Chinese sample, the results were as follows: SBχ² = 462.081, degrees of freedom = 289, p < .001, *CFI = 0.90, *RMSEA = 0.033, 90% confidence interval = 0.027, 0.038. Using the Lagrange multiplier (LM) test, we found that Item 5 to Item 2 (Negative Mood) had the largest standardized residual (LMχ² = 33.99), and Item 22 to Item 10 had the second largest standardized residual (LMχ² = 26.26). The LM test is analogous to the modification indices (MI) in LISREL, although the LM test operates multivariately in determining misspecified parameters in a model (Byrne, 2006). The results for the Italian sample were as follows: SBχ² = 432.496, p < .001, *CFI = 0.90, *RMSEA = 0.030, 90% confidence interval = 0.024, 0.036. Item 13 to Item 7 had the largest standardized residual (LMχ² = 74.47), and Item 4 to Item 1 had the second largest standardized residual (LMχ² = 24.39).

These findings showed an adequate fit in both samples, and the largest standardized residuals involved different item pairs in the Chinese and Italian samples. This supports the use of Kovacs’s (2003) original five-factor structure of the CDI as the baseline model.
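As a rough plausibility check on the baseline fit statistics reported above, the degrees of freedom and the RMSEA point estimates can be recomputed by hand. The sketch below rests on assumptions that the article does not spell out: a simple-structure model (each of the 26 items loads on one of five freely correlated factors, one loading per factor fixed for identification, uncorrelated residuals), the standard single-group RMSEA formula applied to the SB-scaled χ², and the same df of 289 for the Italian sample. The function names are hypothetical, and EQS may compute its robust *RMSEA slightly differently.

```python
import math


def cfa_degrees_of_freedom(n_items, n_factors):
    """df for a simple-structure CFA fitted to a covariance matrix.

    Assumes one loading per factor is fixed for identification, all
    factors covary freely, and residuals are uncorrelated.
    """
    moments = n_items * (n_items + 1) // 2
    free_loadings = n_items - n_factors
    residual_variances = n_items
    factor_variances = n_factors
    factor_covariances = n_factors * (n_factors - 1) // 2
    free = (free_loadings + residual_variances
            + factor_variances + factor_covariances)
    return moments - free


def rmsea(chi2, df, n):
    """Standard RMSEA point estimate: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


df = cfa_degrees_of_freedom(n_items=26, n_factors=5)
print(df)                                  # 289
print(round(rmsea(462.081, df, 550), 3))   # Chinese sample: 0.033
print(round(rmsea(432.496, df, 540), 3))   # Italian sample: 0.03
```

Under these assumptions the hand calculation reproduces the reported df of 289 and the *RMSEA values of 0.033 and 0.030, and doubling the df gives the 578 listed for the configural model in Table 2.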

Reliability Estimates

Using Kovacs’s (2003) original five-factor structure of the CDI as the baseline model, we estimated correlations and Cronbach’s alpha (internal consistency) for the CDI subscales and total scale in both the Chinese and Italian samples. For the Chinese sample, the correlations between CDI subscales ranged from .34 to .63 (all ps < .001), and the correlations between the CDI total scale and subscales ranged from .64 to .89 (all ps < .001). Cronbach’s alpha coefficient for the total scale was .87, and the alphas for the subscales ranged from .40 to .75 (see Table 1). For the Italian sample, the correlations between CDI subscales ranged from .47 to .58 (all ps < .001), and the correlations between the CDI total scale and subscales ranged from .72 to .85 (all ps < .001). Cronbach’s alpha coefficient for the total scale was .87, and the alphas for the subscales ranged from .56 to .71 (see Table 1).
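For readers unfamiliar with the internal consistency coefficient reported here, Cronbach’s alpha can be computed from item-level responses as in the following minimal sketch. The function name and the six response patterns are hypothetical and are not CDI data; the items are scored 0 to 2 only to mirror the CDI response format.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses of six children to a four-item subscale.
responses = [
    [0, 0, 1, 0],
    [1, 1, 1, 0],
    [2, 1, 2, 1],
    [0, 1, 0, 0],
    [2, 2, 1, 1],
    [1, 0, 1, 1],
]
print(round(cronbach_alpha(responses), 2))  # 0.79
```

Alpha rises with the number of items and with the average inter-item correlation, which is one reason short subscales can show lower values than the total scale.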

Testing for Measurement Invariance of the CDI

The first step in the assessment of measurement invariance was to evaluate the configural invariance model (see Table 2). As shown in Table 2, both the *RMSEA statistic and the *CFI index indicate that the configural model fits adequately and provides a reasonable description of the data. In the second step, we assessed the weak invariance model by constraining the factor loadings to equality across groups. Following the method recommended by Satorra and Bentler (2009), all chi-square difference tests were conducted taking into account the scaling correction factor

Table 2. Testing for Invariance Across Chinese and Italian Children Samples: Results of Multigroup Confirmatory Factor Analyses

Model
1. Configural
2. Weak
3. Partial weak
4. Strong
5. Partial strong
6. Strict
7. Partial strict
8. Factor variance/covariance test
9. Partial M8a
10. Test of latent means

SBχ2

df

*RMSEA

90% CI *RMSEA

*CFI

ΔSBχ2

p

Δ*CFI

893.48 946.24 917.54 1349.79 926.29 1184.94 938.83 952.25

578 599 596 621 607 633 622 610

0.02 0.02 0.02 0.03 0.02 0.03 0.02 0.02

0.019, 0.025 0.020, 0.026 0.019, 0.025 0.030, 0.035 0.019, 0.025 0.026, 0.031 0.019, 0.024 0.020, 0.025

0.90 0.89 0.90 0.89 0.90 0.83 0.90 0.89

55.02 (2 vs. 1) 20.44 (3 vs. 1) 584.18 (4 vs. 3) 10.17 (5 vs. 3) 182.25 (6 vs. 5) 15.44 (7 vs. 5) 38.80 (8 vs. 3)