
Journal of Business Research 61 (2008) 1203 – 1218

Advancing formative measurement models

Adamantios Diamantopoulos ⁎, Petra Riefler, Katharina P. Roth

Department of Business Administration, University of Vienna, Bruenner Strasse 72, A-1210 Vienna, Austria

Received 1 May 2007; received in revised form 1 November 2007; accepted 1 January 2008

Abstract

Formative measurement models were first introduced in the literature more than forty years ago, and the discussion about their methodological contribution has been increasing since the 1990s. However, the use of formative indicators for construct measurement in empirical studies is still scarce. This paper seeks to encourage the thoughtful application of formative models by (a) highlighting the potential consequences of measurement model misspecification, and (b) providing a state-of-the-art review of key issues in the formative measurement literature. For the former purpose, this paper summarizes findings of empirical studies investigating the effects of measurement misspecification. For the latter purpose, the article merges contributions in the psychology, management, and marketing literatures to examine a variety of issues concerning the conceptualization, estimation, and validation of formative measurement models. Finally, the article offers some suggestions for future research on formative measurement.

© 2008 Elsevier Inc. All rights reserved.

Keywords: Formative index; Measurement model; Causal indicators

Contents

1. Introduction . . . 1204
2. Reflective vs. formative measurement: first-order models . . . 1204
3. Higher-order formative models . . . 1205
4. Measurement model misspecification . . . 1208
   4.1. Parameter bias due to reversed causality . . . 1208
   4.2. Parameter bias due to incorrect item purification . . . 1210
   4.3. Effects on fit statistics . . . 1210
5. The status quo of formative measures: issues and proposed remedies . . . 1211
   5.1. Conceptual issues . . . 1211
      5.1.1. Error-free measures . . . 1211
   5.2. Interpretation of the error term . . . 1211
   5.3. Estimation of formative models . . . 1212
      5.3.1. Multicollinearity . . . 1212
      5.3.2. Exogenous variable intercorrelations . . . 1212
      5.3.3. Model identification . . . 1213

⁎ Corresponding author. Tel.: +43 1 4277 38031.
E-mail addresses: [email protected] (A. Diamantopoulos), [email protected] (P. Riefler), [email protected] (K.P. Roth).
0148-2963/$ - see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.jbusres.2008.01.009


   5.4. Reliability and validity assessment of formative models . . . 1215
      5.4.1. Reliability assessment . . . 1215
      5.4.2. Validity assessment . . . 1215
6. Conclusion and future research . . . 1216
References . . . 1216

1. Introduction

The literature in psychology, management, and marketing pays increasing attention to formative measurement models for operationalizing latent variables (constructs). Researchers in various disciplines have undertaken considerable effort to (a) make the academic community aware of the existence of formative (cause, causal) indicators (e.g., Bollen and Lennox, 1991), (b) demonstrate the potential appropriateness of formative measurement models for a large number of latent constructs (e.g., Diamantopoulos, 1999; Fassot and Eggert, 2005; Fassot, 2006; Jarvis, MacKenzie and Podsakoff, 2003; Venaik, Midgley and Devinney, 2004), (c) reveal consequences of measurement model misspecification (e.g., Diamantopoulos and Siguaw, 2006; Law and Wong, 1999; MacKenzie, Podsakoff and Jarvis, 2005), and (d) develop practical guidelines for the construction of multi-item measures (indexes) comprising formative indicators (e.g., Diamantopoulos and Winklhofer, 2001; Eggert and Fassot, 2003; Giere, Wirtz and Schilke, 2006).

Despite the growing number of contributions on formative measurement, however, Bollen's (1989, p. 65) statement still holds true, as even a cursory glance at the top management and marketing journals readily reveals: “[M]ost researchers in the social sciences assume that indicators are effect indicators. Cause indicators are neglected despite their appropriateness in many instances”. Two reasons help explain the prevalent lack of applications. On the one hand, a substantial number of researchers engaging in measure development might still be unaware of the potential appropriateness of formative indicators for operationalizing particular constructs (Hitt, Gimeno and Hoskisson, 1998; Podsakoff, Shen and Podsakoff, 2006); indeed “nearly all measurement in psychology and the other social sciences assumes effect indicators” (Bollen, 2002, p. 616). 
On the other hand, researchers might hesitate to specify formative measurement models because they “are often uncertain how to incorporate them into structural equation models” (Bollen and Davis, 1994, p. 2). Indeed, there are a number of controversial and not fully resolved issues concerning the conceptualization, estimation and validation of formative measures (e.g., see Howell et al., 2007, 2008-this issue) including, among others, the treatment of indicator multicollinearity, the assessment of indicator validity, and the interpretation of formatively-measured constructs. This article provides insights into the current state of the literature on formative measurement by merging major contributions in the psychology, management and marketing literatures into an overall picture. The overall aim is to encourage the appropriate use of formative indicators in empirical research while at the same time highlighting potentially problematic issues and suggested remedies.

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

1215 1215 1215 1216 1216

The section that follows provides a brief conceptual discussion of reflective and formative measurement models. The subsequent section covers the problem of measurement model misspecification, followed by a discussion of its consequences. Next, the article turns attention to a number of critical issues concerning the specification, estimation, and validation of formative measures. Finally, the paper concludes by proposing some directions for future research.

2. Reflective vs. formative measurement: first-order models

The assessment of latent variables has a long tradition in social science (e.g., Churchill, 1979; Duncan, 1984; Nunnally, 1978). Latent variables are phenomena of theoretical interest which cannot be directly observed and have to be assessed by manifest measures which are observable. In this context, a measurement model describes relationships between a construct and its measures (items, indicators), while a structural model specifies relationships between different constructs (Edwards and Bagozzi, 2000; Scholderer and Balderjahn, 2006). Anderson and Gerbing (1982, p. 453) note that “the reason for drawing a distinction between the measurement model and the structural model is that proper specification of the measurement model is necessary before meaning can be assigned to the analysis of the structural model”. The measurement model (which is of focal interest in this paper) specifies the relationship between constructs and measures. In this respect, the direction of the relationship is either from the construct to the measures (reflective measurement) or from the measures to the construct (formative measurement).

The first form of specification, that is, the reflective measurement model (see Fig. 1, Panel 1), has a long tradition in the social sciences and is directly based on classical test theory (Lord and Novick, 1968). According to this theory, measures denote effects (or manifestations) of an underlying latent construct (Bollen and Lennox, 1991). 
Therefore, causality is from the construct to the measures. Specifically, the latent variable η represents the common cause shared by all items xi reflecting the construct, with each item corresponding to a linear function of its underlying construct plus measurement error:

xi = λiη + εi    (1)

where xi is the ith indicator of the latent variable η, εi is the measurement error for the ith indicator, and λi is a coefficient (loading) capturing the effect of η on xi. Measurement errors are assumed to be independent (i.e., cov(εi, εj) = 0, for i ≠ j) and unrelated to the latent variable (i.e., cov(η, εi) = 0, for all i).
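The behavior implied by Eq. (1) is easy to verify numerically. The following sketch simulates a reflective construct and checks Bollen's (1984) implication that all indicators of a reflective model intercorrelate positively; the loadings, error variances, and sample size are illustrative assumptions, not values taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000                                    # illustrative sample size

# Latent variable eta and three reflective indicators x_i = lambda_i * eta + eps_i
eta = rng.normal(0.0, 1.0, n)
loadings = np.array([0.9, 0.8, 0.7])          # illustrative lambda_i
errors = rng.normal(0.0, 0.5, (n, 3))         # independent errors, uncorrelated with eta
x = eta[:, None] * loadings + errors

# A change in eta moves all indicators simultaneously,
# so every pairwise correlation between the x_i is positive
corr = np.corrcoef(x, rowvar=False)
print(corr.round(2))
```

With these assumed values the pairwise correlations come out around .7; the exact figures depend on the chosen loadings and error variances, but under Eq. (1) they are always positive.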


Fig. 1. Alternative measurement models.

Eq. (1) is a simple regression equation where the observable measure is the dependent variable and the latent construct is the explanatory variable. A fundamental characteristic of reflective models is that a change in the latent variable causes variation in all measures simultaneously; furthermore, all measures in a reflective measurement model must be positively intercorrelated (for a proof, see Bollen, 1984).

The second form of specification, that is, the formative measurement model, was first proposed by Curtis and Jackson (1962), who challenge the characteristic of positively correlated measures as a necessary condition. They argue that in specific cases measures show negative or zero correlations despite capturing the same concept. Blalock (1964, 1968, 1971) and Land (1970) subsequently discuss this alternative measurement perspective, according to which measures are causes of the construct rather than its effects (see Fig. 1, Panel 2). In other words, the indicators determine the latent variable, which receives its meaning from the former. Some typical examples are socio-economic status (Hauser and Goldberger, 1971; Hauser, 1973), quality of life (e.g., Bollen and Ting, 2000; Fayers, Hand, Bjordal and Groenvold, 1997), or career success (e.g., Judge and Bretz, 1994); Table 1 provides further examples. The formal specification of the formative measurement model is:

η = γ1x1 + γ2x2 + … + γnxn + ζ    (2)

where γi is a coefficient capturing the effect of indicator xi on the latent variable η, and ζ is a disturbance term. The latter comprises all remaining causes of the construct which are not represented in the indicators; the disturbance is assumed to be uncorrelated with the indicators, that is, cov(xi, ζ) = 0. Eq. (2) represents a multiple regression equation; in contrast to Eq. (1), the latent variable is now the dependent variable and the indicators are the explanatory variables. Diamantopoulos and Winklhofer (2001) point out several characteristics of this model which make it sharply distinct from the reflective model. First, the indicators characterize a set of distinct causes which are not interchangeable, as each indicator captures a specific aspect of

the construct's domain (see also Jarvis et al., 2003; and Rossiter, 2002); indeed, omitting an indicator potentially alters the nature of the construct (Bollen and Lennox, 1991). Second, there are no specific expectations about patterns or magnitude of intercorrelations between the indicators; formative indicators might correlate positively or negatively or lack any correlation (for a detailed discussion see Bollen, 1984). Third, formative indicators have no individual measurement error terms, that is, they are assumed to be error-free in a conventional sense (Edwards and Bagozzi, 2000). The error term (ζ) is specified at the construct level (MacCallum and Browne, 1993) and does not constitute measurement error (Diamantopoulos, 2006). Fourth, a formative measurement model, in isolation, is underidentified and, therefore, cannot be estimated (Bollen, 1989; Bollen and Davis, 1994). In contrast, reflective measurement models with three or more indicators are identified and can be estimated (e.g., see Long, 1983). A later section of this paper addresses the estimation of formative models.

3. Higher-order formative models

The formative model specified in Eq. (2) is a first-order measurement model (Edwards, 2001). However, constructs are often conceptualized and subsequently operationalized as multidimensional entities (e.g., Brewer, 2007; Lin, Sher and Shih, 2005; Venaik et al., 2004; Yi and Davis, 2003). From a conceptual point of view, a construct is multidimensional “when it consists of a number of interrelated attributes or dimensions and exists in multidimensional domains. In contrast to a set of interrelated unidimensional constructs, the dimensions of a multidimensional construct can be conceptualized under an overall abstraction, and it is theoretically meaningful and parsimonious to use this overall abstraction as a representation of the dimensions” (Law, Wong and Mobley, 1998, p. 741). 
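Returning to Eq. (2), the first-order formative specification can also be sketched numerically. In this illustration (the weights, disturbance variance, and sample size are assumptions for demonstration purposes), the indicators are drawn independently, so they are essentially uncorrelated, yet each one helps define the construct; omitting an indicator yields a construct that is no longer the same:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000                                    # illustrative sample size

# Three formative indicators drawn independently: no intercorrelation is required
x = rng.normal(0.0, 1.0, (n, 3))
gammas = np.array([0.5, 0.4, 0.6])            # illustrative gamma_i
zeta = rng.normal(0.0, 0.3, n)                # disturbance at the construct level
eta = x @ gammas + zeta                       # Eq. (2): eta = sum(gamma_i * x_i) + zeta

# Omitting the third indicator changes the construct itself
eta_reduced = x[:, :2] @ gammas[:2] + zeta
print(round(float(np.corrcoef(eta, eta_reduced)[0, 1]), 2))
```

The correlation between the full and the reduced construct falls well below one, illustrating Bollen and Lennox's (1991) point that omitting an indicator omits a part of the construct.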
When dealing with multidimensional constructs, it is necessary to distinguish between (at least) two levels of analysis, that is, one level relating manifest indicators to (first-order) dimensions, and a second level relating the individual dimensions to the (second-order) latent construct (Jarvis et al., 2003; MacKenzie et al., 2005). Failing to carefully specify the latter relationships, “one cannot derive the overall


Table 1
Examples of formatively-measured constructs

Author(s) | Journal | Formative construct(s) | Estimation method

Consumer behavior literature
- Lin et al. (2005) | International Journal of Service Industry Management | Customer perceived value | SEM (a)
- Hyman et al. (2002) | Journal of Marketing Theory and Practice | Household affluence | MIMIC model
- Sánchez-Pérez and Iniesta-Bonillo (2004) | Journal of Business and Psychology | Consumers' commitment towards retailers | MIMIC model

Information technology literature
- Brock and Zhou (2005) | Internet Research | Organizational internet use | SEM (PLS) (a)
- Pavlou and Gefen (2005) | Information Systems Research | Psychological contract violation; perceived effectiveness of institutional structures | SEM (PLS) (a)
- Santosa et al. (2005) | European Journal of Information Systems | Intrinsic motivators; situational motivators | SEM (PLS) (a)
- Yi and Davis (2003) | Information Systems Research | Observational learning | SEM (PLS) (a)

Management literature
- Helm (2005) | Corporate Reputation Review | Firm reputation | SEM (PLS) (a)
- Venaik et al. (2005) | Journal of International Business Studies | Environmental controls: local government regulatory influence; quality of local business infrastructure; pressures of global competition; pressures from technological change | SEM (PLS) (a)
- Witt and Rode (2005) | Journal of Enterprising Culture | Corporate identity; corporate culture | SEM (PLS) (a)
- Dowling (2004) | Corporate Reputation Review | Corporate descriptors; corporate reputation | n.a.
- Venaik et al. (2004) | Management International Review | Firm pressures | SEM (PLS) (a)
- Johansson and Yip (1994) | Strategic Management Journal | Industry drivers; organization structure; management process; global strategy | SEM (PLS, LISREL)

Marketing literature
- Bruhn et al. (2008-this issue) | In this Special Issue | Customer equity management | MIMIC model (LISREL)
- Cadogan et al. (2008-this issue) | In this Special Issue | Quality of market-oriented behaviors | SEM (EQS) (a)
- Brewer (2007) | Journal of International Marketing | Psychic distance | Composite score
- Collier and Bienstock (2006) | Journal of Service Research | e-service quality | SEM (AMOS) (a)
- Johnson et al. (2006) | Journal of Advertising | Perceived interactivity: reciprocity; responsiveness; nonverbal information; speed of response | Regression model
- Ulaga and Eggert (2006) | Journal of Marketing | Relationship value | SEM (PLS) (a)
- Reinartz et al. (2004) | Journal of Marketing Research | CRM process implementation | SEM with summated dimension scores (PLS)
- Arnett et al. (2003) | Journal of Retailing | Retailer equity | MIMIC model
- Homburg et al. (2002) | Journal of Marketing | Service orientation | MIMIC model
- Winklhofer and Diamantopoulos (2002) | International Journal of Research in Marketing | Sales forecasting effectiveness | MIMIC model
- Homburg et al. (1999) | Journal of Marketing | Marketing's influence; market-related complexity | SEM (LISREL) (a)

(a) Identification achieved through linkage to two or more reflective constructs.

construct from its dimensions and can only conduct research at the dimensional level, even though these dimensions are claimed theoretically to be under an overall construct” (Law et al., 1998, p. 741). Since either a formative or a reflective specification is applicable at each level, Jarvis et al. (2003) identify four different types of multidimensional constructs, namely, (a) formative first-order and formative second-order (synonyms for this model are “aggregate model”, “composite model”, “emergent model” and “indirect formative model”; e.g., see Cohen et al., 1990; Edwards and Bagozzi, 2000; Giere et al., 2006; Law et al., 1998; Law and Wong, 1999), (b) reflective first-order and formative second-order, (c) formative first-order and reflective second-order, and (d) reflective first-order and reflective second-order models (synonyms for this type of model are “latent model”, “factor model”, “superordinate construct”, “indirect reflective model” and “second-order total disaggregation model”; see Bagozzi and Heatherton, 1994; Edwards, 2001; Edwards and Bagozzi, 2000; Giere et al., 2006;

Law et al., 1998). Since this review focuses on formative measurement, this section only briefly discusses the first three types of multidimensional constructs (see Fig. 2).

The first model in Fig. 2 (Type I) conceptualizes the multidimensional construct as a composite of its dimensions such that the arrows point from the dimensions to the construct (Williams, Edwards and Vandenberg, 2003). The dimensions are thus analogous to formative measures; however, in contrast to the traditional conceptualization of formative measures as observed variables (see Eq. (2)), the dimensions are themselves constructs and conceived as specific components of the second-order construct (Edwards, 2001). In this type of model, the error term exists both at the level of the individual (first-order) dimensions and at the overall construct level. Table 1 provides a number of empirical illustrations of Type I formative multidimensional constructs (e.g., Arnett, Laverie, and Meiers, 2003; Brewer, 2007; Reinartz, Krafft and Hoyer, 2004; Venaik et al.,


2004; Venaik, Midgley and Devinney, 2005; Witt and Rode, 2005; Yi and Davis, 2003; see also Bruhn et al., 2008-this issue). For example, Yi and Davis' (2003) construct of “observational learning processes” comprises four formative first-order dimensions, namely “attention processes”, “retention processes”, “production processes”, and “motivation processes”.

The second type of model shown in Fig. 2 (Type II) represents a second-order construct with first-order formative dimensions which are themselves measured by several reflective manifest items. According to this conceptualization, the error term exists at two different levels, namely (a) at the level of the manifest indicators, where it represents measurement error, and (b) at the level of the second-order construct, where it captures the amount of variance in the second-order construct which the first-order dimensions do not account for. As Type II models have been introduced rather recently, only the most recent literature provides empirical examples of their use (e.g., Johnson, Bruner and Kumar

Fig. 2. Higher-order formative models (adapted from Jarvis et al., 2003, p. 205).


, 2006; Lin et al., 2005; see also Ruiz et al., 2008-this issue). For example, Lin et al. (2005) conceptualize the construct of “customer perceived value” as a second-order factor which is formed by five reflectively specified first-order dimensions, namely “monetary sacrifice”, “website design”, “fulfillment/reliability”, and “security/privacy”.

The third model illustrated in Fig. 2 (Type III) has first-order factors as reflective dimensions, but the first-order dimensions themselves have formative indicators. For this reason, the error term exists at the level of the first-order dimensions only and represents both the variance not explained by the manifest indicators (due to the formative specification of the first-order dimensions) and the variance not explained by the underlying (higher-order) construct. Although Jarvis et al. (2003) include this model in their typology, the literature has not explicitly recognized this kind of model, and empirical examples remain virtually non-existent. The reasons for this are threefold (see also Albers and Götz, 2006). First, as noted above, the nature of the error term is difficult to interpret due to the endogenous position of the formative first-order dimensions. Second, formative indicators capture different facets of a construct and are therefore not interchangeable (Diamantopoulos and Winklhofer, 2001). These indicators give the first-order dimensions their meaning, which, by definition, has to be different for each dimension because of the formative specification (Rossiter, 2002). Since a reflective specification at the second-order level implies that the dimensions are manifestations of a second-order construct, it is unclear whether the meaning of the dimensions is attributable to the formative indicators or to the underlying common cause. Third, Type III models cannot be estimated using current procedures for achieving identification of formative constructs (see the section on model identification later in this paper). 
In short, Type III models do not represent an appealing option for specifying multidimensional constructs.

4. Measurement model misspecification

A number of researchers criticize the prevalent neglect of explicit measurement model specification underlying scale construction efforts (Diamantopoulos and Winklhofer, 2001; Eberl, 2006; Fassot, 2006; Fassot and Eggert, 2005; Fornell and Bookstein, 1982; Jarvis et al., 2003; Podsakoff et al., 2006). Most researchers apply scale development procedures without even questioning their appropriateness for the specific construct at hand (see also Albers and Hildebrandt, 2006; Williams et al., 2004; for a noteworthy exception see Eberl and Schweiger, 2005); indeed, Diamantopoulos and Winklhofer (2001, p. 274) speak of an “almost automatic acceptance of reflective indicators”. Consequently, misspecification commonly takes the form of adopting reflective indicators where formative indicators (and thus index construction approaches) would be appropriate (a Type I error in Diamantopoulos and Siguaw's (2006) terminology). The opposite case of misspecification, that is, the incorrect adoption of a formative model where a reflective model would in fact be appropriate (Type II error), is comparatively rare (Fassot, 2006; Jarvis et al., 2003). An explanation for this difference in evidence of Type I and Type II errors is the

fact that standardized development procedures for reflective scales have been established over the years (e.g., see Churchill, 1979; DeVellis, 2003; Netemeyer, Bearden and Sharma, 2003; Spector, 1992), whereas concrete guidelines for the construction of formative indexes have been proposed only recently (Diamantopoulos and Winklhofer, 2001; Eggert and Fassot, 2003; Giere et al., 2006).

Jarvis et al. (2003) assess the degree of misspecification for studies published in four major marketing journals (Journal of Marketing, Journal of Marketing Research, Marketing Science, and Journal of Consumer Research). Even though they apply a conservative evaluation approach (i.e., classifying operationalizations as correct whenever either reflective or formative measures could plausibly apply), they find about a third of all studies to be subject to measurement model misspecification. Fassot (2006) applies Jarvis et al.'s (2003) approach to three major German management journals (Zeitschrift für Betriebswirtschaft, Zeitschrift für betriebswirtschaftliche Forschung, Die Betriebswirtschaft) and reports similar results (i.e., 35% of all investigated studies include misspecified constructs). In a similar effort, Fassot and Eggert (2005) calculate a misspecification rate of some 80% for a major German marketing journal (Marketing ZFP). This problematic situation is not unique to the marketing literature. Podsakoff et al. (2006) reveal inappropriate modeling for 62% of constructs published in three major strategic management journals (Academy of Management Journal, Administrative Science Quarterly, Strategic Management Journal), while Podsakoff, MacKenzie, Podsakoff and Lee (2003) report a misspecification rate of 47% for leadership research (including publications in The Leadership Quarterly, Journal of Applied Psychology, and again Academy of Management Journal). 
Given this documented prevalence of measurement model misspecification, the obvious question is to what extent misspecification impacts model estimates and fit statistics. This question is important because “any bias in the estimates […] could affect the conclusions about the theoretical relationships among the constructs drawn from the research” (Jarvis et al., 2003, p. 207). The literature review identifies six studies empirically investigating the consequences of measurement model misspecification. Table 2 categorizes these studies along two characteristics. The first characteristic refers to the source of bias investigated, which is either (a) the wrongly specified direction of causality between a given set of indicators and a construct, or (b) the application of an inappropriate item purification procedure (i.e., purifying formative indicators according to guidelines applicable to reflective indicators). The second characteristic refers to the position of the focal misspecified construct in the structural model, which is either exogenous or endogenous. A discussion of the findings of these studies follows.

4.1. Parameter bias due to reversed causality

Jarvis et al. (2003), Law and Wong (1999), and MacKenzie et al. (2005) examine the impact of incorrect causal direction, that is, the specification of a reflective measurement model

Table 2
Empirical studies on consequences of measurement model misspecification

Law and Wong (1999)
  Focus (reason for estimation bias): Reversed causality. Data set: Survey data. Technique: SEM (LISREL).
  Structural parameter estimates (a): exogenous construct misspecified: overestimation; endogenous construct misspecified: over- and underestimation of some parameters.
  Model fit (b): exogenous: CFI ≈, NFI ≈, NNFI ≈, IFI ≈, TLI ≈, χ²/df ↑; endogenous: CFI ↓, RMSEA ↑, χ²/df ↑.
  Additional findings: Also biases in model relationships which do not involve the misspecified construct.

Edwards (2001)
  Focus: Reversed causality. Data set: Published covariance matrices of survey data. Technique: SEM (LISREL, RAMONA).
  Structural parameter estimates (a): exogenous: underestimation (c); endogenous: not tested.
  Model fit (b): not comparable (df = 0 and perfect fit of formative models).
  Additional findings: Concluded that both the multidimensional formative and the reflective specification were inferior to a multivariate structural model.

Jarvis et al. (2003)
  Focus: Reversed causality. Data set: Simulated data. Technique: Monte Carlo simulation.
  Structural parameter estimates (a): exogenous: overestimation (on average: 429%); endogenous: underestimation (on average: 84%).
  Model fit (b): exogenous: CFI ≈, GFI ↑, RMSEA ≈, SRMR ≈, χ²/df ↑; endogenous: CFI ↓, GFI ≈, RMSEA ↑, SRMR ≈.
  Additional findings: Item correlation found to be negatively related to magnitude of estimation bias.

MacKenzie et al. (2005)
  Focus: Reversed causality. Data set: Simulated data. Technique: Monte Carlo simulation, SEM (RAMONA).
  Structural parameter estimates (a): exogenous: overestimation (335% to 555%); endogenous: underestimation (88% to 93%).
  Model fit (b): exogenous: CFI ≈, GFI ≈, RMSEA ↑, NNFI ≈, χ²/df ↑; endogenous: CFI ↓, GFI ↓, RMSEA ↑, SRMR ↑.
  Additional findings: Type II error increases if endogenous or both constructs are misspecified.

Albers and Hildebrandt (2006)
  Focus: Reversed causality and incorrect indicator purification. Data set: Simulated data. Technique: SEM (PLS, LISREL).
  Structural parameter estimates (a): no bias; endogenous: not tested.
  Model fit (b): not given (stated that fit indices were similarly good).

Diamantopoulos and Siguaw (2006)
  Focus: Incorrect indicator purification. Data set: Survey data. Technique: Regression analysis.
  Structural parameter estimates (a): underestimation (d); endogenous: not tested.
  Model fit (b): not tested.

(a) Unstandardized parameter estimates.
(b) ≈ goodness-of-fit index for reflective and formative model similar (difference +/− .05).
(c) Edwards (2001) estimates several second-order models; this comparison concerns the Congeneric and Estimated Loadings Models.
(d) R-squares compared.


when a formative model is conceptually appropriate, for exogenous latent variables. All three studies reveal an overestimation of structural parameters when the latent variable is affected by misspecification. In some cases, the (incorrect) reflective specification even yields a significant parameter estimate, whereas the parameter estimate is not significant in the (correct) formative specification. Thus, the impact of the focal latent variable on other constructs in the structural model tends to be overestimated.

Jarvis et al. (2003) and MacKenzie et al. (2005) additionally examine the impact of incorrect specifications of endogenous latent variables. In contrast to the exogenous case, both studies report an underestimation of the parameter estimate capturing the impact of antecedent variables on the focal construct.

An explanation for these distinct findings of under- and overestimation for endogenous and exogenous positions, respectively, is the difference in portions of variance accounted for by reflective and formative operationalizations. Specifically, a reflective treatment of a formative construct reduces the variance of the construct (see Fornell, Rhee and Yi, 1991 or Namboodiri, Carter and Blalock, 1975) because the variance of a reflectively-measured construct equals the common variance of its measures, whereas the variance of a formatively-measured construct encompasses the total variance of its measures (Law and Wong, 1999). Consequently, if a misspecification reduces the variance of the exogenous variable while the level of the variance of the endogenous variable is maintained, the parameter estimate for their relationship increases. In contrast, if a misspecification reduces the variance of the endogenous variable while the variance of the exogenous variable is unchanged, the relevant structural parameter estimate decreases. 
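This variance argument can be illustrated with a stylized calculation. The sketch below is a toy demonstration of the mechanism, not a re-estimation of any of the cited models; the indicator structure and the structural coefficient of 0.5 are assumptions. Scoring the exogenous construct by only the variance its indicators share (as a reflective treatment does) keeps the covariance with the outcome but shrinks the construct's variance, inflating the unstandardized path:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 100_000

# Three formative indicators sharing a common component c plus unique parts u_i
c = rng.normal(0.0, 1.0, n)
u = rng.normal(0.0, 1.0, (n, 3))
x = c[:, None] + u

# The formative construct carries the TOTAL variance of its indicators
eta = x.sum(axis=1)
y = 0.5 * eta + rng.normal(0.0, 1.0, n)      # true structural coefficient: 0.5

def slope(pred, out):
    """Unstandardized OLS slope of out on pred."""
    return np.cov(pred, out)[0, 1] / np.var(pred, ddof=1)

# Using the full composite recovers roughly the true coefficient;
# using only the common component yields a markedly larger estimate
print(round(slope(eta, y), 2))
print(round(slope(c, y), 2))
```

Because the common component carries only part of the construct's variance while its covariance with y is preserved, the ratio cov/var, and hence the estimated path, goes up, which is the direction of bias the studies above report for misspecified exogenous constructs.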
In any case, these analyses reveal that structural paths are either overestimated or underestimated as a result of measurement model misspecification, with undesirable effects on the substantive interpretation of the structural model relationships.

4.2. Parameter bias due to incorrect item purification

To fully capture the meaning of a formatively-measured construct, a census of indicators is (ideally) required because "[o]mitting an indicator is omitting a part of the construct" (Bollen and Lennox, 1991, p. 308). Therefore, an omission of indicators is equivalent to restricting the domain of the construct (MacKenzie et al., 2005). In the context of index construction, this characteristic implies that the elimination of formative items from the item pool has to be theoretically justified rather than purely based on statistical properties (Diamantopoulos and Winklhofer, 2001; Diamantopoulos and Siguaw, 2006). Indeed, "internal-consistency checks on cause-indicators may lead researchers to discard valid measures improperly" (Bollen, 1984, p. 381) and "following standard scale development procedures – for example dropping items that possess low item-to-total correlations – will remove precisely those items that would most alter the empirical meaning of the construct" (Jarvis et al., 2003, p. 202; see also MacKenzie, 2003). In light of the extensive presence of measurement model misspecification discussed earlier, recent studies examine the

consequences of applying conventional scale development procedures to formative measures. Fassot (2006) provides an example of a misspecified measure that leads to a neglect of a key aspect of the focal construct. Specifically, "perceived friendliness of the staff" is erroneously dropped from a measure of hospital quality because it fails to meet conventional standards for reflective items (i.e., high item-total correlations), despite being a key aspect of a hospital's quality assessment.

In line with this example, Diamantopoulos and Siguaw (2006) find that the same initial item pool results in considerably different final item sets under reflective and formative purification guidelines, respectively. The former approach eliminates items with low inter-item correlations, whereas the latter drops items with high inter-item correlations (as these cause problems of multicollinearity). In Diamantopoulos and Siguaw's (2006) example, the resulting scale and index share no more than two of the 30 initial items. Their study therefore demonstrates how erroneous reflective scale purification processes can substantially alter the meaning of formative constructs.

Albers and Hildebrandt (2006) address the issue of parameter estimation bias due to incorrect indicator purification. First, the authors compare parameter estimates of a reflectively and a formatively specified measurement model using the full item set prior to purification. Second, they compare two formatively specified models, once using the full item pool (i.e., in accordance with the requirement of a census of items) and once using a reduced item set following purification guidelines for reflective scales. The latter comparison reveals an extensive underestimation of structural parameters, while the former shows no significant differences. Therefore, in this example, it is the erroneous purification rather than the causal order misspecification that drives the parameter bias.

4.3. Effects on fit statistics

The studies in Table 2 also examine the impact of misspecification on goodness-of-fit indices for the overall (i.e., measurement and structural) model. An intuitive expectation is that the consequences of misspecification in terms of changed construct meanings and biased parameter estimates would also lead to poor model fit. However, the majority of models incorporating misspecified constructs show highly acceptable values for CFI, GFI, SRMR and RMSEA. Moreover, these values are similar to the goodness-of-fit values obtained for the corresponding correctly specified models. For example, MacKenzie et al. (2005, p. 724) conclude from their study that "each of the four goodness-of-fit indices failed to detect the misspecification of the measurement model". The same applies to all other studies listed in Table 2. Only the chi-square (per degree of freedom) statistic is consistently higher in the incorrectly (reflectively) specified models throughout the studies, thus providing some indication of the underlying misspecification.

Summarizing, all studies empirically examining the consequences of measurement model misspecification on parameter estimates report serious under- or overestimation of parameters as a consequence of misspecified causality, wrongly


adopted purification procedures, or a combination of both. Such biases may in turn lead to incorrect conclusions about tested relationships, thus calling many empirical results into question. Especially alarming is the fact that a satisfactory overall model fit does not guarantee a correct specification, and that misspecifications are not revealed by poor fit index values. It is to be hoped that the empirical demonstration of the undesirable consequences of measurement model misspecification will lend further weight to Bagozzi's (1984) and Jarvis et al.'s (2003) call to justify measurement relationships conceptually, as hypotheses, and subsequently to test them empirically.

5. The status quo of formative measures: issues and proposed remedies

As the introductory section outlines, the literature has only recently started to pay serious attention to formative measurement models, and empirical applications are still rare. As a result, experience with formative measures is limited and several conceptual and practical issues are not yet fully resolved. The following sections discuss such issues and highlight various (sometimes contradictory) views and proposed remedies.

5.1. Conceptual issues

5.1.1. Error-free measures

Formative measurement models incorporate the error term at the construct level and specify individual indicators to be error-free (see Eq. (2) earlier). Some researchers find this absence of measurement error hard to accept. Edwards and Bagozzi (2000), for example, regard such an assumption as untenable in most situations. Addressing this objection, a model such as the one depicted in Fig. 3 is worth considering as one way of incorporating measurement error into formative measurement models. This model is similar to Edwards and Bagozzi's (2000) "spurious model" with multiple common causes, with the only difference that the latent variables are intentionally introduced to enable the accommodation of measurement error in the indicators.
This model inserts a latent variable ξi for each formative indicator xi, so that the focal latent variable η is indirectly linked to the indicators xi via the latent exogenous constructs ξi.

Fig. 3. Modified formative model with individual error terms.

In doing so, each formative indicator becomes a (single) reflective measure of its respective latent variable ξi and consequently comprises an error term; hence, the assumption of error-free indicators is relaxed. Although this model has the substantial advantage of incorporating measurement error, its conceptual justification is questionable for several reasons. First, the inclusion of the first-order constructs ξi introduces a "fictitious" level, which adversely affects model parsimony and suggests that a latent variable can more or less automatically be specified for any manifest variable. Second, given that the xi are not directly linked to η, they cannot legitimately be considered indicators of η, because indicators need to be linked to the construct they assess by a direct relationship. Third, the measures of the ξi in Fig. 3 are single indicators, with all the drawbacks such indicators entail (such as high specificity and low reliability). As a discussion of potential problems with single-item measures is beyond the scope of this paper, the reader is referred to Gardner, Cummings, Dunham and Pierce (1998) and Nunnally and Bernstein (1994) for further details.

5.2. Interpretation of the error term

Eq. (2) and Fig. 1 (Panel 2) show that a formative measurement model specification includes an error (disturbance) term at the construct level. This error term represents the surplus meaning of the construct (Jarvis et al., 2003; Temme, 2006) which is not captured by the set of formative indicators included in the model specification. Diamantopoulos (2006, p. 7) points out that "previous discussions of the error term are often problematic and fail to provide […] a clear interpretation of exactly what the error term represents". Jarvis et al. (2003), for example, describe the error term as the collective (i.e., overall) random error of all formative indicators taken as a group, while MacKenzie et al. (2005, p.
712) interpret the error estimate as capturing "the invalidity of the set of measures — caused by measurement error, interactions among the measures, and/or aspects of the construct domain not represented by the measures". However, the first of these sources, measurement error, is conceptually incorrect. Diamantopoulos (2006) demonstrates that the error term does not represent measurement error because formative indicators are specified to be error-free and, therefore, measurement error cannot be included in the error term at the construct level. The second source, measure interactions, is statistically plausible but lacks substantive interpretation. Since formative indicators determine the meaning of the latent variable, it is not possible to separate the construct's meaning from the indicators' content (Diamantopoulos, 2006). If two indicators show interaction effects, these effects form the construct's meaning just as both indicators separately do. The third source, aspects of the construct domain not represented by the indicators, is indeed the correct interpretation of the nature of the error term. Specifically, "the error term in a formative measurement model represents the impact of all remaining causes other than those represented by the indicators included in the model" (Diamantopoulos, 2006, p. 11). Formative latent variables have a number of proximal causes which researchers try to


identify when conceptually specifying the construct. However, in many cases researchers will be unable to detect all possible causes, as some may neither have been discussed in prior literature nor revealed by exploratory research. The construct-level error term represents these missing causes. This means that the more comprehensive the set of formative indicators specified for the construct, the smaller the influence of the error term. Williams et al. (2003, p. 908) note in this context that "as the variance of the residual increases, the meaning of the construct becomes progressively ambiguous".

5.3. Estimation of formative models

5.3.1. Multicollinearity

Multicollinearity is an undesirable property in formative models as it causes estimation difficulties (Albers and Hildebrandt, 2006; Diamantopoulos and Winklhofer, 2001). These estimation problems arise because a multiple regression links the formative indicators to the construct (see Eq. (2)). Substantial correlations among formative indicators result in unstable estimates for the indicator coefficients γi, and it becomes difficult to separate the distinct influence of individual indicators on the latent variable η. Diamantopoulos and Winklhofer (2001) further note that multicollinearity leads to difficulties in assessing indicator validity on the basis of the magnitude of the parameters γi (Bollen, 1984; MacKenzie et al., 2005). The literature proposes different approaches for dealing with multicollinearity. Bollen and Lennox (1991) argue that indicators which intercorrelate highly are almost perfect linear combinations of one another and thus quite likely contain redundant information. Based on this view, several authors (e.g., Diamantopoulos and Winklhofer, 2001; Götz and Liehr-Gobbers, 2004) suggest indicator elimination based on the variance inflation factor (VIF), which assesses the degree of multicollinearity.
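As a minimal sketch of the VIF diagnostic (ours; the indicator data are simulated), VIF_j = 1/(1 − R²_j), where R²_j comes from regressing indicator j on the remaining indicators:

```python
import numpy as np

def vif(X):
    """VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing
    indicator j on the remaining indicators (with an intercept)."""
    X = np.asarray(X, dtype=float)
    out = []
    for j in range(X.shape[1]):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(len(y)), others])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        r2 = 1 - resid.var() / y.var()
        out.append(1 / (1 - r2))
    return np.array(out)

rng = np.random.default_rng(1)
n = 500
z = rng.normal(size=n)
x1 = z + 0.1 * rng.normal(size=n)   # x1 and x2 are nearly collinear
x2 = z + 0.1 * rng.normal(size=n)
x3 = rng.normal(size=n)             # independent indicator
v = vif(np.column_stack([x1, x2, x3]))
print(np.round(v, 1))
```

Under the conventional cut-off of 10, the two nearly collinear indicators would be flagged while the independent one would not; as the text stresses, however, actual elimination should never rest on this statistic alone.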
Some empirical studies on formative measure development (e.g., Diamantopoulos and Siguaw, 2006; Helm, 2005; Sánchez-Pérez and Iniesta-Bonillo, 2004; Witt and Rode, 2005) follow this advice, usually by applying the commonly accepted cut-off value of VIF > 10 or its tolerance equivalent (see Giere et al., 2006; Hair, Anderson, Tatham and Black, 1998; Kennedy, 2003). However, considering that this multicollinearity check leads to indicator elimination on purely statistical grounds, and given the danger of altering the meaning of the construct by excluding indicators (Bollen and Lennox, 1991), "[i]ndicator elimination – by whatever means – should not be divorced from conceptual considerations when a formative measurement model is involved" (Diamantopoulos and Winklhofer, 2001, p. 273).

Albers and Hildebrandt (2006) put forward a different approach for overcoming multicollinearity: combining formative indicators into an index (using either an arithmetic or a geometric mean) and using the latter as a single-item construct in the subsequent analysis. However, although intuitively appealing, this suggestion raises two important questions. First, what is the substantive meaning of the joint index of two indicators? If, for example, income and age show a high intercorrelation (which appears to be a likely assumption) and their measures are

consequently combined into an index, what exactly does this index capture? Second, having included this index in Eq. (2), what kind of information does its corresponding regression parameter estimate provide? Is it the impact of a joint unit change in both income and age?

5.3.2. Exogenous variable intercorrelations

One general issue when specifying measurement models is the specification of inter-indicator correlations. In reflective models, a common approach is to free all covariances among exogenous variables, allowing for intercorrelations. In formative models, following this strategy leads to a large number of additional parameters, namely estimates of covariances between (a) formative indicators within a construct, (b) formative indicators between constructs, and (c) exogenous latent constructs (Bollen and Lennox, 1991; MacCallum and Browne, 1993). Bollen and Lennox (1991) recommend allowing for intercorrelation of formative measures which relate to the same construct (without, however, expecting any specific pattern). Furthermore, they argue that for both reflectively and formatively-measured constructs it is likely, though not necessary, that item correlations within constructs exceed item correlations between constructs.

Based on this argument, MacCallum and Browne (1993) consider two possible approaches for specifying correlations. The first approach specifies formative indicators of the same construct to be correlated with each other but uncorrelated with indicators of other constructs. The obvious advantage of this procedure is that it retains model parsimony, as no non-hypothesized paths are added. The obtained goodness-of-fit indices are hence solely based on the hypothesized relationships, that is, the relationships of interest. The shortcoming of this approach, however, is that fixing the covariances to zero leads to blocks of zeros in the implied covariance matrix.
These zero covariances assume that the corresponding indicators and/or latent variables are perfectly uncorrelated. MacCallum and Browne (1993) note that this assumption carries substantive meaning for the model and thus requires theoretical justification. They therefore refrain from recommending this approach. Jarvis et al. (2003) further argue that any common cause of the variables concerned that is not incorporated in the model contributes to a lack of model fit. Consequently, they also conclude that fixing covariances to zero is an inappropriate method.

The second approach specifies formative indicators to be correlated with each other as well as with indicators of other constructs or exogenous variables. The major advantage of this method is that all variables are allowed to covary, instead of assuming complete independence, which is not theoretically justifiable. This approach, however, also raises a number of problematic issues. First, the number of parameters to be estimated increases, thereby decreasing the number of degrees of freedom. Second, MacCallum and Browne (1993) empirically show that the additional parameters provide little explanatory value. Consequently, models lack parsimony without providing substantive meaning in explaining inter-measure


relationships. Furthermore, the estimates for unhypothesized parameters influence the overall model fit, even though they are not of interest. Despite these shortcomings, MacCallum and Browne (1993) recommend this method over the option of having blocks of zeros in the implied covariance matrix. Jarvis et al. (2003) agree, but stress the need to locate the impact of the non-hypothesized parameter estimates on model fit. They suggest estimating a series of nested models, that is, freeing parameters step by step and comparing the overall model fit across steps.

5.3.3. Model identification

A major concern with formative measurement models is how to establish statistical identification to enable their estimation. In isolation, formatively-measured constructs as defined by Eq. (2) are underidentified (Bollen and Lennox, 1991; MacCallum and Browne, 1993; Temme, 2006) and, thus, cannot be estimated. This inability to estimate formative measurement models without the introduction of additional information (see below) has resulted in criticisms of the value of formative measurement in general (see Howell et al., 2007, 2008-this issue). As with reflective measurement models, two necessary, but not sufficient, conditions have to be met for identifying models including formatively-measured constructs (Bollen, 1989; Bollen and Davis, 1994; Cantaluppi, 2002a,b; Edwards, 2001; Temme, 2006). First, the number of non-redundant elements in the covariance matrix of the observed variables needs to be greater than or equal to the number of unknown parameters in the model (the t-rule). Second, the latent construct needs to be scaled (the scaling rule).
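The t-rule is a simple counting check; the helper below is our own sketch (not from the paper), comparing the free parameters against the p(p+1)/2 non-redundant elements of the observed covariance matrix:

```python
def t_rule(n_observed: int, n_free_params: int) -> bool:
    """Necessary (but not sufficient) t-rule for identification:
    the free parameters must not outnumber the non-redundant
    elements of the observed covariance matrix."""
    non_redundant = n_observed * (n_observed + 1) // 2
    return n_free_params <= non_redundant

# e.g., five observed variables yield 5 * 6 / 2 = 15 non-redundant elements
print(t_rule(5, 12), t_rule(5, 16))
```

For example, five observed variables yield 15 non-redundant elements, so a specification with 16 free parameters fails this (necessary, not sufficient) condition.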
For the latter condition, three main options are available (Bollen and Davis, 1994; MacCallum and Browne, 1993), namely (a) fixing a path from a formative indicator to the construct, (b) fixing a path from the formatively-measured construct to a reflectively-measured endogenous latent variable, or (c) standardizing the formatively-measured construct by fixing its variance to unity. Edwards (2001) recommends the last of these options, because fixing path parameters precludes estimating standard errors of theoretically interesting relationships. Note that the choice of scaling method can affect substantive conclusions, as the significance of different relationships in a model with a formatively-measured construct may vary depending on how the scale of the latter is set (see Franke et al., 2008-this issue).

The t-rule and scaling rule are, however, not sufficient conditions for identifying formative measurement models. In this context, Bollen (1989) draws attention to the fact that the formative measurement model needs to be placed within a larger model that incorporates consequences (i.e., effects) of the latent variable in question to enable its estimation. Specifically, for identifying the disturbance term ζ at the construct level, the formative latent variable needs to emit at least two paths to other (reflective) constructs or indicators (MacCallum and Browne, 1993); the literature also refers to this condition as the 2+ emitted paths rule (Bollen and Davis, 1994). The literature discusses three approaches for applying the 2+ emitted paths rule, namely (a) adding two reflective indicators to the formatively-measured construct, (b) adding two reflectively-measured constructs as outcome variables, and (c) a mixture of these two approaches, that is, adding a single reflective indicator and a reflectively-measured construct as an outcome variable.

5.3.3.1. Adding two reflective indicators. The first option is adding two reflective measures to the set of formative indicators (see Fig. 4). Jarvis et al. (2003) and MacKenzie et al. (2005) recommend this method on the grounds that (a) it does not require adding constructs to the model solely for identification purposes (which contributes to model parsimony), and (b) measurement parameters are stable and less sensitive to changes in structural parameters. However, this model allows for different conceptual interpretations (Jarvis et al., 2003), namely (a) a MIMIC model (Jöreskog and Goldberger, 1975), (b) an endogenous construct with two reflective indicators that is influenced by exogenous observed variables, or (c) a formatively-measured construct which influences indicators of another construct. MacKenzie et al. (2005) argue that the constellation resulting from adding two reflective measures to the formative specification should not be interpreted as a MIMIC model but as a latent variable having a mixture of formative and reflective indicators (since both types of indicators belong to the same concept domain and are content-valid operationalizations of the same construct). In contrast, MacCallum and Browne (1993), Scholderer and Balderjahn (2006) and Temme (2006) explicitly equate models with mixed indicators and MIMIC models. It is outside the scope of this paper to discuss these interpretations in detail, but it should be stressed that despite the different possible interpretations at a conceptual level, there are no differences at the empirical level (the models yield the same parameter estimates).
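In the lavaan-style model syntax shared by SEM packages such as lavaan (R) and semopy (Python), this specification can be sketched as follows; the construct and indicator names (eta, x1–x3, y1–y2) are hypothetical, and the snippet only builds the specification string rather than fitting anything:

```python
# Hypothetical MIMIC-type specification in lavaan-style syntax:
# eta is the focal construct, x1-x3 its formative (causal) indicators,
# y1-y2 the two reflective indicators added for identification.
mimic_desc = """
eta =~ y1 + y2      # reflective part: scales eta and identifies zeta
eta ~ x1 + x2 + x3  # formative part: eta regressed on its causes
"""
print(mimic_desc)
```

Fitting such a model would require data for all five observed variables; whichever conceptual interpretation is adopted, the parameter estimates are the same.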

Fig. 4. Identification using a MIMIC model.



Fig. 5. Identification with two reflectively-measured constructs.

5.3.3.2. Adding two reflective constructs. According to Bollen and Davis (1994), another option for establishing model identification is the specification of two structural relations from the formative latent variable to two reflectively-measured constructs (Fig. 5). While these two reflectively-measured constructs need to be unrelated in models which comprise only the focal formatively-measured construct and the two reflectively-measured constructs (as in Fig. 5), the reflective constructs may be causally related in larger models (Temme, 2006). This model is justifiable if two reflectively-measured constructs can be included in the nomological network on the basis of theoretical considerations. However, including reflectively-measured outcome variables purely for identification reasons puts the theoretical model specification into question if these outcomes are not of theoretical interest. Note also that the choice of outcome constructs potentially affects the interpretation of the formatively-measured construct itself by influencing the estimates of the γ-parameters (see Heise, 1972; Howell et al., 2007; and also Franke et al., 2008-this

issue). Indeed, as Bagozzi (2007, p. 236) observes, "the parameters relating the observed variables to their purported formative latent variable are functions of the number and nature of endogenous latent variables and their measures".

5.3.3.3. Adding one reflective indicator and one reflective construct. This model is a mixture of the two previous procedures and involves adding one reflective indicator to the latent construct and linking the latter to a reflectively-measured latent variable (Fig. 6). This mixed approach is applicable if the theoretical model includes only one structural relationship of the formatively-measured latent variable to a reflectively-measured latent variable. In this case, including a reflective indicator such as a global measure helps to overcome underidentification and might simultaneously be used for validation purposes (see Diamantopoulos and Winklhofer, 2001). Temme (2006) demonstrates that the 2+ emitted paths rule is a necessary but not necessarily sufficient condition for identification when the two reflectively-measured outcome

Fig. 6. Identification with one reflective measure and one reflectively-measured construct.


constructs are either directionally related (i.e., one directly impacts on the other), or their disturbance terms are correlated. Such models require imposing further restrictions in order to establish full model identification (such as fixing the covariance of the disturbance terms to zero, or using a partially reduced form model; for details, see Bollen and Davis, 1994; Cantaluppi, 2002a,b; and Temme, 2006). In addition, models which violate the 2+ emitted paths rule because they contain formatively-measured constructs that emit only one path can be identified by fixing the variance of the disturbance term to zero (MacCallum and Browne, 1993). MacCallum and Browne (1993) advise applying this approach with caution, as it implies the theoretical assumption that the formative indicators completely capture the construct. In other words, this approach assumes that a census of indicators of the latent variable is undertaken at the item generation stage and, hence, that no unexplained variance exists. By fixing the disturbance term to zero, the formative construct becomes a weighted linear combination of its indicators without any surplus meaning (Diamantopoulos, 2006; MacKenzie et al., 2005). Although there are examples of constructs for which all possible indicators could conceivably be specified (Diamantopoulos, 2006), in most cases this assumption is not reasonable (Bollen and Davis, 1994) and therefore setting the error term to zero is not justifiable.

Finally, like all formative measurement models, the three higher-order models in Fig. 2 are, in isolation, statistically underidentified and cannot be estimated. Since a discussion of necessary conditions for identifying higher-order models is beyond the scope of this review, the reader is referred to Albers and Götz (2006), Cantaluppi (2002a,b), Edwards (2001), Giere et al. (2006), Jarvis et al. (2003), Temme (2006), and Williams et al. (2003).

5.4. Reliability and validity assessment of formative models

5.4.1. Reliability assessment

As the correlations between formative indicators may be positive, negative or zero (Bollen, 1984; Diamantopoulos and Winklhofer, 2001), reliability in an internal consistency sense is not meaningful for formative indicators (Bagozzi, 1994; Hulland, 1999). As Nunnally and Bernstein (1994) put it, "internal consistency is of minimal importance because two variables that might even be negatively correlated can both serve as meaningful indicators of a construct". Similarly, Bollen and Lennox (1991) explicitly alert researchers not to rely on correlation matrices for indicator selection, as this might lead to eliminating valid measures. While Rossiter (2002, p. 388) dismisses all forms of reliability assessment, claiming that "for a formed attribute, there is […] no question of unreliability", and several other authors skip the issue of reliability assessment when discussing formative measure development (e.g., Diamantopoulos and Winklhofer, 2001; Eggert and Fassot, 2003), Bagozzi (1994) and Diamantopoulos (2005) recommend reliability assessment for formative indicators in the form of test-retest reliability (see e.g., DeVellis, 2003; Spector, 1992). MacKenzie et al. (2005) additionally propose using the correlation between formative


indicators and an alternative measure assessing the focal construct. What needs to be clarified, however, is how such a correlation should be interpreted. Would a non-significant correlation unambiguously mean that the focal measure lacks reliability? What if the alternative measure is itself unreliable? Does this approach actually test the reliability of the focal measure, or is it rather a test of convergent validity?

5.4.2. Validity assessment

One of the most controversial issues in the formative measurement literature is validity assessment. Some researchers argue that no quantitative quality checks can be used to assess the appropriateness of formative indices (e.g., Homburg and Klarmann, 2006). Others note that the applicability of statistical procedures is limited, as the choice of formative indicators determines the conceptual meaning of the construct (Albers and Hildebrandt, 2006). Rossiter (2002, p. 315) dismisses any validity assessment for formative indicators, claiming that "all that is needed is a set of distinct components as decided by expert judgment". However, most researchers do not share these views. Edwards and Bagozzi (2000, p. 171), for example, stress that "if measures are specified as formative, their validity must still be established. It is bad practice to […] claim that one's measures are formative, and do nothing more".

5.4.2.1. Individual indicator validity. Bollen (1989) argues that the γ-parameters, which reflect the impact of the formative indicators on the latent construct (see Eq. (2)), indicate indicator validity. The γ-parameters capture the contribution of each individual indicator to the construct; therefore, items with non-significant γ-parameters should be considered for elimination, as they cannot represent valid indicators of the construct (assuming that multicollinearity is not an issue).
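A minimal sketch of this check (ours; the data are simulated and a global proxy stands in for the construct): regress the proxy on the formative indicators and inspect the t-statistics of the γ estimates. Here x3 is constructed with a zero true weight, i.e., as an invalid indicator:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400
x = rng.normal(size=(n, 3))                 # three candidate indicators
# proxy for the construct: x3 has zero true weight (invalid indicator)
proxy = 0.6 * x[:, 0] + 0.4 * x[:, 1] + rng.normal(size=n)

A = np.column_stack([np.ones(n), x])        # add intercept
beta, *_ = np.linalg.lstsq(A, proxy, rcond=None)
resid = proxy - A @ beta
s2 = resid @ resid / (n - A.shape[1])       # residual variance
se = np.sqrt(s2 * np.diag(np.linalg.inv(A.T @ A)))
t = beta / se
print(np.round(t[1:], 1))                   # t-statistics of the gammas
```

x1 and x2 come out clearly significant while x3 does not, making x3 a candidate for (theoretically justified) elimination.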
Diamantopoulos and Winklhofer (2001) build upon Bollen's (1989) argument and recommend using a MIMIC model, because it simultaneously allows estimation of the γ-parameters and provides an overall model fit (which is indicative of the validity of the formative indicators as a set). An alternative (or additional) approach is to assess indicator validity by estimating the indicators' correlations with an external variable. For example, Diamantopoulos and Winklhofer (2001) suggest including a global measure summarizing the essence of the construct (see also Fayers et al., 1997). Assuming that the overall measure is a valid criterion, the relationship between a formative indicator and the overall measure indicates indicator validity (Eggert and Fassot, 2003; MacKenzie et al., 2005). Following this approach, indicators correlating highly with the external variable are retained, whereas those showing low or non-significant relationships are candidates for elimination. Lastly, a formative measurement model specification implies that the latent variable completely mediates the effects of its indicators on other (outcome) variables (see Figs. 4 and 5). This implies certain proportionality constraints on the model coefficients (Bollen and Davis, 1994; Hauser, 1973). If such proportionality constraints do not hold for a particular indicator, the validity of the latter is questionable (see Franke et al., 2008-this issue).
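The proportionality constraint can be illustrated with a simulation (ours; all coefficients are illustrative). Because each indicator reaches the outcomes only through the construct, the reduced-form coefficients of any indicator on two outcomes must share the same ratio β1/β2:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
x = rng.normal(size=(n, 3))                     # formative indicators
gamma = np.array([0.7, 0.5, 0.4])
eta = x @ gamma + 0.3 * rng.normal(size=n)      # formative construct
y1 = 0.8 * eta + rng.normal(size=n)             # two reflective outcomes
y2 = 0.4 * eta + rng.normal(size=n)

A = np.column_stack([np.ones(n), x])
b1, *_ = np.linalg.lstsq(A, y1, rcond=None)
b2, *_ = np.linalg.lstsq(A, y2, rcond=None)
ratios = b1[1:] / b2[1:]                        # should all be near 0.8/0.4 = 2
print(np.round(ratios, 2))
```

An indicator whose ratio departs markedly from the common value would violate the constraint and so cast doubt on its validity.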



5.4.2.2. Construct validity. After examining validity at the individual indicator level, the next step involves assessing validity at the overall construct level. An important point in this regard is that "causal indicators are not invalidated by low internal consistency, so to assess validity we need to examine other variables that are effects of the latent variable" (Bollen and Lennox, 1991, p. 312, emphasis added). One common approach focuses on nomological (Jarvis et al., 2003; MacKenzie et al., 2005; Reinartz et al., 2004) and criterion-related (Diamantopoulos and Siguaw, 2006; Edwards, 2001; Jarvis et al., 2003) validity. MacKenzie et al. (2005) suggest proceeding as with reflective scales, that is, estimating hypothesized relationships of the focal construct with theoretically related constructs. These estimated relationships should be consistent with the expected direction and be significantly different from zero. Diamantopoulos and Winklhofer (2001) also underline the importance of nomological validation, particularly in cases where indicators have been purified. Rossiter (2002, p. 327) challenges the approach of evaluating the validity of a formative index by relating it to other constructs, arguing that "[a] scale's validity should be established independently for the construct". In response, Diamantopoulos (2005) points out that, by definition, all forms of validity — with the exception of face and content validity — are defined in terms of relationships with other measures (see Carmines and Zeller, 1979; Zeller and Carmines, 1980). Concerning other types of validity assessment, Bagozzi (1994, p. 338) states that "construct validity in terms of convergent and discriminant validity [is] not meaningful when indexes are formed as linear sums of measurement". In contrast, MacKenzie et al.
(2005) suggest that standard procedures for assessing discriminant validity are equally applicable to formative indexes, which include testing (a) whether the focal construct less than perfectly correlates with related constructs, and/or (b) whether it shares less than half of its variance with some other construct, that is, construct intercorrelation is less than .71 (Fornell and Larcker, 1981). Diamantopoulos (2006) proposes using the variance of the error term as an indication of construct validity. Since the error term captures aspects of the construct's domain that the set of indicators neglect, the lower the variance of the error term, the more valid the construct (see also Williams et al., 2003). If the set of indicators is comprehensive in the sense that it includes all important construct facets, the construct meaning is validly captured; accordingly, the residual variance is likely to be small. Finally, confirmatory tetrad analysis (CTA) (Bollen and Ting, 1993, 2000; Eberl, 2006, Gudergan et al., 2008-this issue) offers a basic test of construct validity. Although Bollen and Ting (2000, p. 4) originally propose CTA as “an empirical test of whether a causal or effect indicator specification is appropriate”, interpreting evidence supporting the latter as also supporting the construct's validity is reasonable. 6. Conclusion and future research Building on a review of literature relating to the specification, estimation, and validation of formative measurement

models, this article hopefully lends a helping hand to researchers considering the adoption of formative measurement in their empirical efforts, while, at the same time, encouraging a critical perspective in the application of formative indicators. Concerning future research, one major issue concerns the conceptual plausibility of formatively-measured constructs occupying endogenous positions in structural models. While a number of studies incorporate formative latent variables in such positions (e.g., Edwards, 2001; Jarvis et al., 2003; MacKenzie et al., 2005), Wiley (2005, p. 124, emphasis in original) notes that there is "no mechanism by which an antecedent variable can influence a formative index". Since the set of causal indicators and the disturbance term jointly account for the total variation of a formatively-measured construct, the specification of an additional source of variation (i.e., an antecedent construct) is conceptually questionable. Given the conceptual and practical importance of this issue, a debate on the use of formatively-measured constructs as endogenous variables is urgently required. Another issue for future research concerns modeling formatively-measured constructs as moderator variables in structural models. Although the literature provides empirical examples of formatively specified moderators (e.g., Reinartz et al., 2004), more research on using formatively-measured constructs when forming interaction terms is needed. Finally, there is a debate on whether formative measurement is really necessary, that is, whether it should be used in the first place. Bagozzi (2007, p. 236), for example, states that "formative measurement can be done but only for a limited range of cases and under restrictive assumptions", while Howell et al. (2007, p. 216; see also Howell et al., 2008-this issue) argue that "formative measurement is not an equally attractive alternative [to reflective measurement]".
Although there are those (including the authors and Bollen, 2007) who feel that, despite its various shortcomings, formative measurement is indeed a viable alternative to reflective measurement on conceptual grounds, further theoretical and methodological research is necessary to finally settle this debate. Time will tell.

References

Albers S, Götz O. Messmodelle mit Konstrukten zweiter Ordnung in der betriebswirtschaftlichen Forschung. Betriebswirtschaft 2006;66(6):669–77.
Albers S, Hildebrandt L. Methodische Probleme bei der Erfolgsfaktorenforschung — Messfehler, formative versus reflektive Indikatoren und die Wahl des Strukturgleichungs-Modells. Zfbf 2006;58:2–33.
Anderson J, Gerbing D. Some methods for respecifying measurement models to obtain unidimensional construct measurement. J Mark Res 1982;19(4):453–60.
Arnett DB, Laverie DA, Meiers A. Developing parsimonious retailer equity indexes using partial least squares analysis: a method and applications. J Retail 2003;79:161–70.
Bagozzi RP. A prospectus for theory construction in marketing. J Mark 1984;48:11–29.
Bagozzi RP. Structural equation models in marketing research: basic principles. In: Bagozzi RP, editor. Principles of marketing research. Oxford: Blackwell; 1994. p. 317–85.
Bagozzi RP. On the meaning of formative measurement and how it differs from reflective measurement: comment on Howell, Breivik, and Wilcox. Psychol Methods 2007;12(2):229–37.

Bagozzi RP, Heatherton TF. A general approach to representing multifaceted personality constructs: application to state self-esteem. Struct Equ Modeling 1994;1(1):35–67.
Blalock HM. Causal inferences in nonexperimental research. Chapel Hill: University of North Carolina Press; 1964.
Blalock HM. Theory building and causal inferences. In: Blalock HM, Blalock A, editors. Methodology in social research. New York: McGraw-Hill; 1968. p. 155–98.
Blalock HM. Causal models involving unobserved variables in stimulus-response situations. In: Blalock HM, editor. Causal models in the social sciences. Chicago: Aldine; 1971. p. 335–47.
Bollen K. Multiple indicators: internal consistency or no necessary relationship? Qual Quant 1984;18:377–85.
Bollen K. Structural equations with latent variables. New York: Wiley; 1989.
Bollen K, Lennox R. Conventional wisdom on measurement: a structural equation perspective. Psychol Bull 1991;110(2):305–14.
Bollen K. Latent variables in psychology and the social sciences. Annu Rev Psychol 2002;53:605–34.
Bollen K. Interpretational confounding is due to misspecification, not to type of indicator: comment on Howell, Breivik, and Wilcox. Psychol Methods 2007;12(2):219–28.
Bollen K, Ting K. Confirmatory tetrad analysis. In: Marsden PV, editor. Sociological methodology. Washington, DC: American Sociological Association; 1993. p. 147–75.
Bollen K, Davis W. Causal indicator models: identification, estimation, and testing. Paper presented at the American Sociological Association Convention, Miami; 1994.
Bollen K, Ting K. A tetrad test for causal indicators. Psychol Methods 2000;5(1):3–22.
Brewer P. Operationalizing psychic distance: a revised approach. J Int Mark 2007;15(1):44–66.
Brock JK, Zhou Y. Organizational use of the internet — scale development and validation. Internet Res 2005;15(1):67–87.
Bruhn M, Georgi D, Hadwich K. Customer equity management as a formative second-order construct. J Bus Res 2008;61:1292–301 (this issue). doi:10.1016/j.jbusres.2008.01.016.
Cadogan JW, Souchon AL, Procter DB. The quality of market-oriented behaviors: formative index construction and validation. J Bus Res 2008;61:1263–77 (this issue). doi:10.1016/j.jbusres.2008.01.014.
Cantaluppi G. Some further remarks on parameter identification of structural equation models with both formative and reflexive relationships. In: A.A., V.V., editors. Studi in onore di Angelo Zanella. Milano: Vita e Pensiero; 2002a. p. 89–104.
Cantaluppi G. The problem of parameter identification of structural equation models with both formative and reflexive relationships: some theoretical results. Serie Edizioni Provvisorie No. 108, Istituto di Statistica, Università Cattolica del S. Cuore, Milano; 2002b. p. 1–19.
Carmines EG, Zeller RA. Reliability and validity assessment. In: Sullivan JL, editor. Quantitative applications in the social sciences. Beverly Hills: Sage; 1979.
Churchill GA. A paradigm for developing better measures of marketing constructs. J Mark Res 1979;16:64–73.
Cohen P, Cohen J, Teresi J, Marchi M, Velez CN. Problems in the measurement of latent variables in structural equations causal models. Appl Psychol Meas 1990;14:183–96.
Collier JE, Bienstock CC. Measuring service quality in e-retailing. J Serv Res 2006;8(3):260–75.
Curtis RF, Jackson EF. Multiple indicators in survey research. Am J Sociol 1962;68:195–204.
DeVellis RF. Scale development — theory and applications. Applied social research methods series. 2nd edition. Thousand Oaks, CA: Sage Publications; 2003.
Diamantopoulos A. Viewpoint: export performance measurement: reflective versus formative indicators. Int Mark Rev 1999;16(6):444–57.
Diamantopoulos A. The C-OAR-SE procedure for scale development in marketing: a comment. Int J Res Mark 2005;22:1–9.
Diamantopoulos A. The error term in formative measurement models: interpretations and modelling implications. J Modell Manage 2006;1(1):7–17.
Diamantopoulos A, Winklhofer H. Index construction with formative indicators: an alternative to scale development. J Mark Res 2001;38(2):269–77.


Diamantopoulos A, Siguaw J. Formative versus reflective indicators in organizational measure development: a comparison and empirical illustration. Br J Manage 2006;17(4):263–82.
Dowling G. Journalists' evaluation of corporate reputations. Corp Reputation Rev 2004;7(2):196–205.
Duncan OD. Notes on social measurement: historical and critical. New York: Russell Sage; 1984.
Eberl M. Formative und reflektive Konstrukte und die Wahl des Strukturgleichungsverfahrens. Betriebswirtschaft 2006;66(6):651–68.
Eberl M, Schwaiger M. Corporate reputation: disentangling the effects on financial performance. Eur J Mark 2005;39:838–54.
Edwards JR. Multidimensional constructs in organizational behavior research: an integrative analytical framework. Organ Res Methods 2001;4(2):144–92.
Edwards JR, Bagozzi R. On the nature and direction of relationships between constructs and measures. Psychol Methods 2000;5(2):155–74.
Eggert A, Fassot G. Zur Verwendung formativer und reflektiver Indikatoren in Strukturgleichungsmodellen. Kaiserslaut Schr reihe Mark 2003;20:1–18.
Fassot G. Operationalisierung latenter Variablen in Strukturgleichungsmodellen: Eine Standortbestimmung. Zfbf 2006;58:67–88.
Fassot G, Eggert A. Zur Verwendung formativer und reflektiver Indikatoren in Strukturgleichungsmodellen: Bestandsaufnahme und Anwendungsempfehlung. In: Bliemel FW, Eggert A, Fassot G, Henseler J, editors. Handbuch PLS-Modellierung: Methode, Anwendung, Praxisbeispiele. Stuttgart: Schaeffer-Poeschel; 2005. p. 31–47.
Fayers PM, Hand DJ, Bjordal K, Groenvold M. Causal indicators in quality of life research. Qual Life Res 1997;6:393–406.
Fornell C, Larcker DF. Evaluating structural equation models with unobservable variables and measurement error. J Mark Res 1981;18:39–50.
Fornell C, Bookstein FL. A comparative analysis of two structural equation models: LISREL and PLS applied to market data. In: Fornell C, editor. A second generation of multivariate analysis, vol. 1. New York: Praeger; 1982. p. 289–324.
Fornell C, Rhee BD, Yi Y. Direct regression, reverse regression, and covariance structure analysis. Mark Lett 1991;2(3):309–20.
Franke G, Preacher KJ, Rigdon E. The proportional structural effects of formative indicators. J Bus Res 2008;61:1229–37 (this issue). doi:10.1016/j.jbusres.2008.01.011.
Gardner D, Cummings L, Dunham R, Pierce J. Single-item versus multiple-item measurement scales: an empirical comparison. Educ Psychol Meas 1998;58(6):898–915.
Giere J, Wirtz B, Schilke O. Mehrdimensionale Konstrukte: Konzeptionelle Grundlagen und Möglichkeiten ihrer Analyse mithilfe von Strukturgleichungsmodellen. Betriebswirtschaft 2006;66(6):678–95.
Götz O, Liehr-Gobbers K. Analyse von Strukturgleichungsmodellen mit Hilfe der Partial-Least-Squares(PLS)-Methode. Betriebswirtschaft 2004;64(6):714–38.
Gudergan SP, Ringle CM, Wende S, Will A. Confirmatory tetrad analysis for evaluating the mode of measurement models in PLS path modeling. J Bus Res 2008;61:1238–49 (this issue).
Hair JF, Anderson RE, Tatham RL, Black WC. Multivariate data analysis. New Jersey: Prentice Hall; 1998.
Hauser RM. Disaggregating a social-psychological model of educational attainment. In: Goldberger AS, Duncan OD, editors. Structural equation models in the social sciences. San Diego: Academic Press; 1973. p. 255–89.
Hauser RM, Goldberger AS. The treatment of unobservable variables in path analysis. Sociol Method 1971:81–117.
Heise DR. Employing nominal variables, induced variables, and block variables in path analyses. Sociol Methods Res 1972;1:147–73.
Helm S. Designing a formative measure for corporate reputation. Corp Reputation Rev 2005;8(2):95–109.
Hitt MA, Gimeno J, Hoskisson RE. Current and future research methods in strategic management. Organ Res Methods 1998;1:6–44.
Homburg C, Klarmann M. Die Kausalanalyse in der empirischen betriebswirtschaftlichen Forschung — Problemfelder und Anwendungsempfehlungen. Betriebswirtschaft 2006;66(6):727–48.
Homburg C, Workman JP, Krohmer H. Marketing's influence within the firm. J Mark 1999;63(2):1–17.


Homburg C, Hoyer W, Fassnacht M. Service orientation of a retailer's business strategy: dimensions, antecedents, and performance outcomes. J Mark 2002;66(4):86–101.
Howell RD, Breivik E, Wilcox JB. Reconsidering formative measurement. Psychol Methods 2007;12(2):205–18.
Howell RD, Breivik E, Wilcox JB. Questions about formative measurement. J Bus Res 2008;61:1219–28 (this issue). doi:10.1016/j.jbusres.2008.01.010.
Hulland J. Use of partial least squares (PLS) in strategic management research: a review of four recent studies. Strateg Manage J 1999;20:195–204.
Hyman M, Ganesh G, McQuitty S. Augmenting the household influence construct. J Mark Theory Pract 2002;10(3):13–31.
Jarvis C, MacKenzie S, Podsakoff P. A critical review of construct indicators and measurement model misspecification in marketing and consumer research. J Consum Res 2003;30(2):199–218.
Johansson JK, Yip GS. Exploiting globalization potential: U.S. and Japanese strategies. Strateg Manage J 1994;15(8):579–601.
Johnson GJ, Bruner II GC, Kumar A. Interactivity and its facets revisited. J Advert 2006;35(4):35–52.
Jöreskog K, Goldberger A. Estimation of a model with multiple indicators and multiple causes of a single latent variable. J Am Stat Assoc 1975;70:631–9.
Judge TA, Bretz RD. Person-organization fit and the theory of work adjustment: implications for satisfaction, tenure, and career success. J Vocat Behav 1994;44(1):32–54.
Kennedy P. A guide to econometrics. 5th edition. Cambridge, MA: MIT Press; 2003.
Land K. On estimation of path coefficients for unmeasured variables from correlations among observed variables. Soc Forces 1970;48:506–11.
Law K, Wong C. Multidimensional constructs in structural equation analysis: an illustration using the job perception and job satisfaction constructs. J Manage 1999;25(2):143–60.
Law KS, Wong CS, Mobley WH. Toward a taxonomy of multidimensional constructs. Acad Manage Rev 1998;23(4):741–55.
Lin CH, Sher PJ, Shih HY. Past progress and future directions in conceptualizing customer perceived value. Int J Serv Ind Manag 2005;16(4):318–36.
Long JS. Confirmatory factor analysis: a preface to LISREL. Beverly Hills, CA: Sage Publications; 1983.
Lord FM, Novick MR. Statistical theories of mental test scores. Reading, MA: Addison-Wesley; 1968.
MacCallum R, Browne M. The use of causal indicators in covariance structure models: some practical issues. Psychol Bull 1993;114(3):533–41.
MacKenzie SB. The danger of poor construct conceptualization. J Acad Mark Sci 2003;31(3):323–6.
MacKenzie S, Podsakoff P, Jarvis C. The problem of measurement model misspecification in behavioural and organizational research and some recommended solutions. J Appl Psychol 2005;90(4):710–30.
Namboodiri NK, Carter LF, Blalock HM. Applied multivariate analysis and experimental designs. New York: McGraw-Hill; 1975.
Netemeyer RG, Bearden WO, Sharma S. Scaling procedures. Thousand Oaks, CA: Sage; 2003.
Nunnally JC. Psychometric theory. 2nd edition. New York: McGraw-Hill; 1978.
Nunnally JC, Bernstein IH. Psychometric theory. 3rd edition. New York: McGraw-Hill; 1994.
Pavlou P, Gefen D. Psychological contract violation in online marketplaces: antecedents, consequences, and moderating role. Inf Syst Res 2005;16(4):372–99.
Podsakoff PM, MacKenzie SB, Podsakoff NP, Lee JY. The mismeasure of man(agement) and its implications for leadership research. Leadersh Q 2003;14:615–56.

Podsakoff NP, Shen W, Podsakoff PM. The role of formative measurement models in strategic management research: review, critique, and implications for future research. Res Methodol Strat Manag 2006;3:197–252.
Reinartz W, Krafft M, Hoyer WD. The customer relationship management process: its measurement and impact on performance. J Mark Res 2004;41(3):293–305.
Rossiter J. The C-OAR-SE procedure for scale development in marketing. Int J Res Mark 2002;19:305–35.
Ruiz DM, Gremler DD, Washburn JH, Cepeda-Carrión G. Service value revisited: specifying a higher-order, formative measure. J Bus Res 2008;61:1278–91 (this issue). doi:10.1016/j.jbusres.2008.01.015.
Sánchez-Pérez M, Iniesta-Bonillo M. Consumers' felt commitment towards retailers: index development and validation. J Bus Psychol 2004;19(2):141–59.
Santosa PI, Wei KK, Chan HC. User involvement and user satisfaction with information-seeking activity. Eur J Inf Syst 2005;14(4):361–70.
Scholderer J, Balderjahn I. Was unterscheidet harte und weiche Strukturgleichungsmodelle nun wirklich? Mark ZFP 2006;28(1):57–70.
Spector PE. Summated rating scale construction: an introduction. Quantitative applications in the social sciences series. Newbury Park, CA: Sage Publications; 1992.
Temme D. Die Spezifikation und Identifikation formativer Messmodelle der Marketingforschung in Kovarianzstrukturanalysen. Mark ZFP 2006;28(3):183–209.
Ulaga W, Eggert A. Value-based differentiation in business relationships: gaining and sustaining key supplier status. J Mark 2006;70(1):119–36.
Venaik S, Midgley DF, Devinney TM. A new perspective on the integration-responsiveness pressures confronting multinational firms. Manag Int Rev 2004;44(Special Issue 2004/1):15–48.
Venaik S, Midgley DF, Devinney TM. Dual paths to performance: the impact of global pressures on MNC subsidiary conduct and performance. J Int Bus Stud 2005;36(6):655–75.
Wiley J. Reflections on formative measures: conceptualization and implications for use. ANZMAC Conference, Perth; December 5–7, 2005.
Williams LJ, Edwards JR, Vandenberg RJ. Recent advances in causal modeling methods for organizational and management research. J Manage 2003;29(6):903–36.
Williams LJ, Gavin MB, Hartman NS. Structural equation modeling methods in strategy research: applications and issues. In: Ketchen DJ, Bergh DD, editors. Research methodology in strategy and management. Boston, MA: Elsevier; 2004. p. 303–46.
Winklhofer H, Diamantopoulos A. Managerial evaluation of sales forecasting effectiveness: a MIMIC model approach. Int J Res Mark 2002;19:151–66.
Witt P, Rode V. Corporate brand building in start-ups. J Enterp Cult 2005;13(3):273–94.
Yi MY, Davis FD. Developing and validating an observational learning model of computer software training and skill acquisition. Inf Syst Res 2003;14(2):146–69.
Zeller RA, Carmines EG. Measurement in the social sciences — the link between theory and data. Cambridge: Cambridge University Press; 1980.



⁎ Corresponding author. Tel.: +43 1 4277 38031. E-mail addresses: [email protected] (A. Diamantopoulos), [email protected] (P. Riefler), [email protected] (K.P. Roth). 1 Tel.: +43 1 4277 38038. 2 Tel.: +43 1 4277 38040. 0148-2963/$ - see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.jbusres.2008.01.009



1. Introduction

The literature in psychology, management, and marketing pays increasing attention to formative measurement models for operationalizing latent variables (constructs). Researchers in various disciplines have undertaken considerable effort to (a) make the academic community aware of the existence of formative (cause, causal) indicators (e.g., Bollen and Lennox, 1991), (b) demonstrate the potential appropriateness of formative measurement models for a large number of latent constructs (e.g., Diamantopoulos, 1999; Fassot and Eggert, 2005; Fassot, 2006; Jarvis, MacKenzie and Podsakoff, 2003; Venaik, Midgley and Devinney, 2004), (c) reveal the consequences of measurement model misspecification (e.g., Diamantopoulos and Siguaw, 2006; Law and Wong, 1999; MacKenzie, Podsakoff and Jarvis, 2005), and (d) develop practical guidelines for the construction of multi-item measures (indexes) comprising formative indicators (e.g., Diamantopoulos and Winklhofer, 2001; Eggert and Fassot, 2003; Giere, Wirtz and Schilke, 2006). Despite the growing number of contributions on formative measurement, however, Bollen's (1989, p. 65) statement still holds true, as even a cursory glance at the top management and marketing journals readily reveals: "[M]ost researchers in the social sciences assume that indicators are effect indicators. Cause indicators are neglected despite their appropriateness in many instances". Two reasons help explain the prevalent lack of applications. On the one hand, a substantial number of researchers engaging in measure development might still be unaware of the potential appropriateness of formative indicators for operationalizing particular constructs (Hitt, Gimeno and Hoskisson, 1998; Podsakoff, Shen and Podsakoff, 2006); indeed, "nearly all measurement in psychology and the other social sciences assumes effect indicators" (Bollen, 2002, p. 616).
On the other hand, researchers might hesitate to specify formative measurement models because they "are often uncertain how to incorporate them into structural equation models" (Bollen and Davis, 1994, p. 2). Indeed, a number of controversial and not fully resolved issues concern the conceptualization, estimation, and validation of formative measures (e.g., see Howell et al., 2007, 2008-this issue), including, among others, the treatment of indicator multicollinearity, the assessment of indicator validity, and the interpretation of formatively-measured constructs. This article provides insights into the current state of the literature on formative measurement by merging major contributions in the psychology, management, and marketing literatures into an overall picture. The overall aim is to encourage the appropriate use of formative indicators in empirical research while at the same time highlighting potentially problematic issues and suggested remedies.


The section that follows provides a brief conceptual discussion of reflective and formative measurement models. The subsequent section covers the problem of measurement model misspecification, followed by a discussion of its consequences. Next, the article turns attention to a number of critical issues concerning the specification, estimation, and validation of formative measures. Finally, the paper concludes by proposing some directions for future research.

2. Reflective vs. formative measurement: first-order models

The assessment of latent variables has a long tradition in social science (e.g., Churchill, 1979; Duncan, 1984; Nunnally, 1978). Latent variables are phenomena of theoretical interest which cannot be directly observed and have to be assessed by manifest measures which are observable. In this context, a measurement model describes the relationships between a construct and its measures (items, indicators), while a structural model specifies the relationships between different constructs (Edwards and Bagozzi, 2000; Scholderer and Balderjahn, 2006). Anderson and Gerbing (1982, p. 453) note that "the reason for drawing a distinction between the measurement model and the structural model is that proper specification of the measurement model is necessary before meaning can be assigned to the analysis of the structural model". The measurement model (which is of focal interest in this paper) specifies the relationship between constructs and measures. In this respect, the direction of the relationship is either from the construct to the measures (reflective measurement) or from the measures to the construct (formative measurement). The first form of specification, that is, the reflective measurement model (see Fig. 1, Panel 1), has a long tradition in the social sciences and is directly based on classical test theory (Lord and Novick, 1968). According to this theory, measures denote effects (or manifestations) of an underlying latent construct (Bollen and Lennox, 1991).
Therefore, causality flows from the construct to the measures. Specifically, the latent variable η represents the common cause shared by all items xi reflecting the construct, with each item corresponding to a linear function of its underlying construct plus measurement error:

xi = λiη + εi    (1)

where xi is the ith indicator of the latent variable η, εi is the measurement error for the ith indicator, and λi is a coefficient (loading) capturing the effect of η on xi. Measurement errors are assumed to be independent (i.e., cov(εi, εj) = 0, for i ≠ j) and unrelated to the latent variable (i.e., cov(η, εi) = 0, for all i).
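A minimal simulation of Eq. (1) makes this structure concrete (the loadings and error variance below are arbitrary illustrative values, not estimates from any study): because η is the common cause of all indicators, their pairwise correlations come out positive.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000

# Latent variable eta and three reflective indicators per Eq. (1):
#   x_i = lambda_i * eta + epsilon_i
eta = rng.normal(size=n)
loadings = [0.9, 0.8, 0.7]                      # illustrative loadings
X = np.column_stack(
    [lam * eta + rng.normal(scale=0.5, size=n) for lam in loadings]
)

# Because eta drives every indicator, all pairwise correlations
# are positive, as the reflective model implies (Bollen, 1984).
corr = np.corrcoef(X, rowvar=False)
print(np.round(corr, 2))
```

With these values, all off-diagonal correlations are strongly positive; shrinking a loading toward zero weakens the corresponding correlations but cannot make them negative as long as the loadings share the same sign.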


Fig. 1. Alternative measurement models.

Eq. (1) is a simple regression equation where the observable measure is the dependent variable and the latent construct is the explanatory variable. A fundamental characteristic of reflective models is that a change in the latent variable causes variation in all measures simultaneously; furthermore, all measures in a reflective measurement model must be positively intercorrelated (for a proof, see Bollen, 1984). The second form of specification, that is, the formative measurement model, was first proposed by Curtis and Jackson (1962), who challenge the requirement of positively correlated measures as a necessary condition. They argue that in specific cases measures show negative or zero correlations despite capturing the same concept. Blalock (1964, 1968, 1971) and Land (1970) subsequently discuss this alternative measurement perspective, according to which measures are causes of the construct rather than its effects (see Fig. 1, Panel 2). In other words, the indicators determine the latent variable, which receives its meaning from the former. Some typical examples are socio-economic status (Hauser and Goldberger, 1971; Hauser, 1973), quality of life (e.g., Bollen and Ting, 2000; Fayers, Hand, Bjordal and Groenvold, 1997), or career success (e.g., Judge and Bretz, 1994); Table 1 provides further examples. The formal specification of the formative measurement model is:

η = γ1x1 + γ2x2 + … + γnxn + ζ    (2)

where γi is a coefficient capturing the effect of indicator xi on the latent variable η, and ζ is a disturbance term. The latter comprises all remaining causes of the construct which are not represented in the indicators and are uncorrelated with the latter; that is, cov(xi, ζ) = 0 is assumed. Eq. (2) represents a multiple regression equation in which, in contrast to Eq. (1), the latent variable is the dependent variable and the indicators are the explanatory variables. Diamantopoulos and Winklhofer (2001) point out several characteristics of this model which make it sharply distinct from the reflective model. First, the indicators constitute a set of distinct causes which are not interchangeable, as each indicator captures a specific aspect of

the construct's domain (see also Jarvis et al., 2003; and Rossiter, 2002); indeed, omitting an indicator potentially alters the nature of the construct (Bollen and Lennox, 1991). Second, there are no specific expectations about patterns or magnitude of intercorrelations between the indicators; formative indicators might correlate positively or negatively or lack any correlation (for a detailed discussion see Bollen, 1984). Third, formative indicators have no individual measurement error terms, that is, they are assumed to be error-free in a conventional sense (Edwards and Bagozzi, 2000). The error term (ζ) is specified at the construct level (MacCallum and Browne, 1993) and does not constitute measurement error (Diamantopoulos, 2006). Fourth, a formative measurement model, in isolation, is underidentified and, therefore, cannot be estimated (Bollen, 1989; Bollen and Davis, 1994). In contrast, reflective measurement models with three or more indicators are identified and can be estimated (e.g., see Long, 1983). A later section of this paper addresses the estimation of formative models. 3. Higher-order formative models The formative model specified in Eq. (2) is a first-order measurement model (Edwards, 2001). However, constructs are often conceptualized and subsequently operationalized as multidimensional entities (e.g., Brewer, 2007; Lin, Sher and Shih, 2005; Venaik et al., 2004; Yi and Davis, 2003). From a conceptual point of view, a construct is multidimensional “when it consists of a number of interrelated attributes or dimensions and exists in multidimensional domains. In contrast to a set of interrelated unidimensional constructs, the dimensions of a multidimensional construct can be conceptualized under an overall abstraction, and it is theoretically meaningful and parsimonious to use this overall abstraction as a representation of the dimensions” (Law, Wong and Mobley, 1998, p. 741). 
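Before turning to these higher-order variants, the first-order specification in Eq. (2) can be made concrete with a small simulation. The indicator weights, the disturbance scale, and the use of plain least squares are illustrative assumptions, not part of the original formulation:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Three formative indicators; under Eq. (2) they need not be intercorrelated.
x = rng.normal(size=(n, 3))

# Hypothetical weights gamma_i and a construct-level disturbance zeta
# with cov(x_i, zeta) = 0, as the specification assumes.
gamma = np.array([0.6, 0.3, 0.5])
zeta = rng.normal(scale=0.4, size=n)

# Eq. (2): the latent variable is a weighted sum of its indicators plus zeta.
eta = x @ gamma + zeta

# Because eta is observable in a simulation, the weights can be recovered
# by the multiple regression that Eq. (2) represents.
X = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(X, eta, rcond=None)
print(np.round(coef[1:], 2))  # estimates close to the assumed gamma values
```

In real applications η is latent, so Eq. (2) cannot be estimated this directly; the later section on model identification discusses how estimation becomes possible.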
When dealing with multidimensional constructs, it is necessary to distinguish between (at least) two levels of analysis, that is, one level relating manifest indicators to (first-order) dimensions, and a second level relating the individual dimensions to the (second-order) latent construct (Jarvis et al., 2003; MacKenzie et al., 2005). Failing to carefully specify the latter relationships, “one cannot derive the overall

A. Diamantopoulos et al. / Journal of Business Research 61 (2008) 1203–1218

Table 1
Examples of formatively-measured constructs

Author(s) | Journal | Formative construct(s) | Estimation method

Consumer behavior literature
Lin et al. (2005) | International Journal of Service Industry Management | Customer perceived value | SEM a
Hyman et al. (2002) | Journal of Marketing Theory and Practice | Household affluence | MIMIC model
Sánchez-Pérez and Iniesta-Bonillo (2004) | Journal of Business and Psychology | Consumers' commitment towards retailers | MIMIC model

Information technology literature
Brock and Zhou (2005) | Internet Research | Organizational internet use | SEM (PLS) a
Pavlou and Gefen (2005) | Information Systems Research | Psychological contract violation; Perceived effectiveness of institutional structures | SEM (PLS) a
Santosa et al. (2005) | European Journal of Information Systems | Intrinsic motivators; Situational motivators | SEM (PLS) a
Yi and Davis (2003) | Information Systems Research | Observational learning | SEM (PLS) a

Management literature
Helm (2005) | Corporate Reputation Review | Firm reputation | SEM (PLS) a
Venaik et al. (2005) | Journal of International Business Studies | Environmental controls: local government regulatory influence; quality of local business infrastructure; pressures of global competition; pressures from technological change | SEM (PLS) a
Witt and Rode (2005) | Journal of Enterprising Culture | Corporate identity; Corporate culture | SEM (PLS) a
Dowling (2004) | Corporate Reputation Review | Corporate descriptors; Corporate reputation | Regression model
Venaik et al. (2004) | Management International Review | Firm pressures | SEM (PLS) a
Johansson and Yip (1994) | Strategic Management Journal | Industry drivers; Organization structure; Management process; Global strategy | SEM (PLS) a

Marketing literature
Bruhn et al. (2008-this issue) | In this Special Issue | Customer equity management | SEM (PLS, LISREL)
Cadogan et al. (2008-this issue) | In this Special Issue | Quality of market-oriented behaviors | MIMIC model (LISREL)
Brewer (2007) | Journal of International Marketing | Psychic distance | Composite score
Collier and Bienstock (2006) | Journal of Service Research | e-service quality | SEM (AMOS) a
Johnson et al. (2006) | Journal of Advertising | Perceived interactivity: reciprocity; responsiveness; nonverbal information; speed of response | n.a.
Ulaga and Eggert (2006) | Journal of Marketing | Relationship value | SEM (EQS) a
Reinartz et al. (2004) | Journal of Marketing Research | CRM process implementation | SEM with summated dimension scores (PLS)
Arnett et al. (2003) | Journal of Retailing | Retailer equity | MIMIC model
Homburg et al. (2002) | Journal of Marketing | Service orientation | MIMIC model
Winklhofer and Diamantopoulos (2002) | International Journal of Research in Marketing | Sales forecasting effectiveness | MIMIC model
Homburg et al. (1999) | Journal of Marketing | Marketing's influence; Market-related complexity | SEM (LISREL) a

a Identification achieved through linkage to two or more reflective constructs.

construct from its dimensions and can only conduct research at the dimensional level, even though these dimensions are claimed theoretically to be under an overall construct" (Law et al., 1998, p. 741). Since for each level either formative or reflective specifications are applicable, Jarvis et al. (2003) identify four different types of multidimensional constructs, namely, (a) formative first-order and formative second-order (synonyms for this model are "aggregate model", "composite model", "emergent model" and "indirect formative model"; e.g.,

see Cohen et al., 1990; Edwards and Bagozzi, 2000; Giere et al., 2006; Law et al., 1998; Law and Wong, 1999), (b) reflective first-order and formative second-order, (c) formative first-order and reflective second-order, and (d) reflective first-order and reflective second-order models (synonyms for this type of model are “latent model”, “factor model”, “superordinate construct”, “indirect reflective model” and “second-order total disaggregation model”; see Bagozzi and Heatherton, 1994; Edwards, 2001; Edwards and Bagozzi, 2000; Giere et al., 2006;

Law et al., 1998). Since this review focuses on formative measurement, this section only briefly discusses the first three types of multidimensional constructs (see Fig. 2). The first model in Fig. 2 (Type I) conceptualizes the multidimensional construct as a composite of its dimensions such that the arrows point from the dimensions to the construct (Williams, Edwards and Vandenberg, 2003). The dimensions are thus analogous to formative measures; however, in contrast to the traditional conceptualization of formative measures as observed variables (see Eq. (2)), the dimensions are themselves constructs and conceived as specific components of the second-order construct (Edwards, 2001). In this type of model, the error term exists both at the level of the individual (first-order) dimensions and at the overall construct level. Table 1 provides a number of empirical illustrations of Type I formative multidimensional constructs (e.g., Arnett, Laverie, and Meiers, 2003; Brewer, 2007; Reinartz, Krafft and Hoyer, 2004; Venaik et al.,

2004; Venaik, Midgley and Devinney, 2005; Witt and Rode, 2005; Yi and Davis, 2003; see also Bruhn et al., 2008-this issue). For example, Yi and Davis' (2003) construct of "observational learning processes" comprises four formative first-order dimensions, namely "attention processes", "retention processes", "production processes", and "motivation processes". The second type of model shown in Fig. 2 (Type II) represents a second-order construct with first-order formative dimensions which are themselves measured by several reflective manifest items. According to this conceptualization, the error term exists at two different levels, namely (a) at the level of the manifest indicators, where it represents measurement error, and (b) at the level of the second-order construct, where it captures the amount of variance in the second-order construct which the first-order dimensions do not account for. As Type II models have been introduced rather recently, only the most recent literature provides empirical examples of their use (e.g., Johnson, Bruner and Kumar

Fig. 2. Higher-order formative models (adapted from Jarvis et al., 2003, p. 205).

, 2006; Lin et al., 2005; see also Ruiz et al., 2008-this issue). For example, Lin et al. (2005) conceptualize the construct of "customer perceived value" as a second-order factor formed by reflectively specified first-order dimensions, among them "monetary sacrifice", "website design", "fulfillment/reliability", and "security/privacy". The third model illustrated in Fig. 2 (Type III) has first-order factors as reflective dimensions, but the first-order dimensions themselves have formative indicators. For this reason, the error term exists at the level of the first-order dimensions only and represents both the variance not explained by the manifest indicators (due to the formative specification of the first-order dimensions) and the variance not explained by the underlying (higher-order) construct. Although Jarvis et al. (2003) include this model in their typology, the literature has not explicitly recognized this kind of model, and empirical examples remain virtually non-existent. The reasons for this are threefold (see also Albers and Götz, 2006). First, as noted above, the nature of the error term is difficult to interpret due to the endogenous position of the formative first-order dimensions. Second, formative indicators capture different facets of a construct and are therefore not interchangeable (Diamantopoulos and Winklhofer, 2001). These indicators give the first-order dimensions their meaning, which, by definition, has to be different for each dimension because of the formative specification (Rossiter, 2002). Since a reflective specification at the second-order level implies that the dimensions are manifestations of a second-order construct, it is unclear whether the meaning of the dimensions is attributable to the formative indicators or to the underlying common cause. Third, Type III models cannot be estimated using current procedures for achieving identification of formative constructs (see section on model identification later in this paper).
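The different placement of error terms across these model types can be sketched numerically. The following Type I illustration uses arbitrary weights and error scales (none taken from the cited studies): a disturbance enters once at each first-order dimension and again at the overall construct level.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 1000

# Manifest indicators of two first-order dimensions (hypothetical setup).
x1 = rng.normal(size=(n, 2))
x2 = rng.normal(size=(n, 2))

# Dimension-level disturbances: error exists at the first-order level ...
dim1 = x1 @ np.array([0.7, 0.4]) + rng.normal(scale=0.3, size=n)
dim2 = x2 @ np.array([0.5, 0.6]) + rng.normal(scale=0.3, size=n)

# ... and again at the overall construct level (Type I specification):
# the second-order construct is formed from the dimensions plus its own zeta.
eta = 0.6 * dim1 + 0.5 * dim2 + rng.normal(scale=0.3, size=n)
```

Dropping the dimension-level disturbances and adding indicator-level errors instead would turn this sketch into the Type II structure described above.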
In short, Type III models do not represent an appealing option for specifying multidimensional constructs. 4. Measurement model misspecification A number of researchers criticize the prevalent neglect of explicit measurement model specification underlying scale construction efforts (Diamantopoulos and Winklhofer, 2001; Eberl, 2006; Fassot, 2006; Fassot and Eggert, 2005; Fornell and Bookstein, 1982; Jarvis et al., 2003; Podsakoff et al., 2006). Most researchers apply scale development procedures without questioning their appropriateness for the specific construct at hand (see also Albers and Hildebrandt, 2006; Williams et al., 2004; for a noteworthy exception see Eberl and Schweiger, 2005); indeed, Diamantopoulos and Winklhofer (2001, p. 274) speak of an "almost automatic acceptance of reflective indicators". Consequently, misspecification commonly concerns the adoption of reflective indicators where formative indicators (and thus index construction approaches) would be appropriate (a Type I error in Diamantopoulos and Siguaw's (2006) terminology). The other case of misspecification, that is, the incorrect adoption of a formative model where a reflective model would in fact be appropriate (Type II error), is comparatively rare (Fassot, 2006; Jarvis et al., 2003). An explanation for this difference in evidence of Type I and Type II errors is the

fact that standardized development procedures for reflective scales have been established over the years (e.g., see Churchill, 1979; DeVellis, 2003; Netemeyer, Bearden and Sharma, 2003; Spector, 1992), whereas concrete guidelines for the construction of formative indexes have only recently been proposed (Diamantopoulos and Winklhofer, 2001; Eggert and Fassot, 2003; Giere et al., 2006). Jarvis et al. (2003) assess the degree of misspecification for studies published in four major marketing journals (Journal of Marketing, Journal of Marketing Research, Marketing Science, and Journal of Consumer Research). Even though they apply a conservative evaluation approach (i.e., classifying operationalizations as correct whenever either reflective or formative measures could in principle apply), they find about a third of all studies to be subject to measurement model misspecification. Fassot (2006) applies Jarvis et al.'s (2003) approach to three major German management journals (Zeitschrift für Betriebswirtschaft, Zeitschrift für betriebswirtschaftliche Forschung, Die Betriebswirtschaft) and reports similar results (i.e., 35% of all investigated studies include misspecified constructs). In a similar effort, Fassot and Eggert (2005) calculate a misspecification rate of some 80% for a major German marketing journal (Marketing ZFP). This problematic situation is not unique to the marketing literature. In similar efforts, Podsakoff et al. (2006) reveal inappropriate modeling for 62% of constructs published in three major strategic management journals (Academy of Management Journal, Administrative Science Quarterly, Strategic Management Journal), while Podsakoff, MacKenzie, Podsakoff and Lee (2003) report a misspecification rate of 47% for leadership research (including publications in The Leadership Quarterly, Journal of Applied Psychology, and again Academy of Management Journal).
Given this documented existence of measurement model misspecification, the obvious question is to what extent misspecification impacts model estimates and fit statistics. This question is important because "any bias in the estimates […] could affect the conclusions about the theoretical relationships among the constructs drawn from the research" (Jarvis et al., 2003, p. 207). The literature review identifies six studies empirically investigating the consequences of measurement model misspecification. Table 2 categorizes these studies along two characteristics. The first characteristic refers to the source of bias investigated, which is either (a) the wrongly specified direction of causality between a given set of indicators and a construct, or (b) the application of an inappropriate item purification procedure (i.e., purifying formative indicators according to guidelines applicable for reflective indicators). The second characteristic refers to the position of the focal misspecified construct in the structural model, which is either exogenous or endogenous. A discussion of the findings of these studies follows. 4.1. Parameter bias due to reversed causality Jarvis et al. (2003), Law and Wong (1999), and MacKenzie et al. (2005) examine the impact of incorrect causal direction, that is, the specification of a reflective measurement model

Table 2
Empirical studies on consequences of measurement model misspecification

Author(s) | Focus (reason for estimation bias) | Data set | Technique | Structural parameter estimates a (exogenous / endogenous construct misspecified) | Model fit b (exogenous / endogenous construct misspecified) | Additional findings

Law and Wong (1999) | Reversed causality | Survey data | SEM (RAMONA) | Overestimation / Not tested | CFI ≈, NFI ≈, NNFI ≈, IFI ≈, TLI ≈, χ2/df ↑ / Not tested | Also biases in model relationships which do not involve the misspecified construct
Edwards (2001) | Reversed causality | Published covariance matrices of survey data | SEM (LISREL, RAMONA) | Underestimation c; over- and underestimation of some parameters / Not tested | CFI ↓, RMSEA ↑, χ2/df ↑ / Not tested | Concluded that both the multidimensional formative and the reflective specification were inferior to a multivariate structural model
Jarvis et al. (2003) | Reversed causality | Simulated data | Monte Carlo simulation | Overestimation (335% to 555%) / Underestimation (88% to 93%) | CFI ≈, GFI ↑, RMSEA ≈, SRMR ≈, χ2/df ↑ / Not comparable (df = 0 and perfect fit of formative models) | –
MacKenzie et al. (2005) | Reversed causality | Simulated data | Monte Carlo simulation | Overestimation (on average: 429%) / Underestimation (on average: 84%) | CFI ↓, GFI ≈, RMSEA ↑, SRMR ≈ / CFI ↓, GFI ↓, RMSEA ↑, SRMR ↑ | Type II error increases if endogenous or both constructs are misspecified
Albers and Hildebrandt (2006) | Reversed causality and incorrect indicator purification | Simulated data | SEM (PLS, LISREL) | No bias (reversed causality); underestimation (incorrect purification) / Not tested | Not given (stated that fit indices were similarly good) | Item correlation found to be negatively related to magnitude of estimation bias
Diamantopoulos and Siguaw (2006) | Incorrect indicator purification | Survey data | Regression analysis | Underestimation d / Not tested | CFI ≈, GFI ≈, RMSEA ↑, NNFI ≈, χ2/df ↑ / Not tested | –

a Unstandardized parameter estimates.
b ≈ Goodness-of-fit index for reflective and formative model similar (difference +/− .05).
c Edwards (2001) estimates several second-order models; this comparison concerns the Congeneric and Estimated Loadings Models.
d R-squares compared.
when a formative model is conceptually appropriate, for exogenous latent variables. All three studies reveal an overestimation of structural parameters when the latent variable is affected by misspecification. In some cases, the (incorrect) reflective specification even yields a significant parameter estimate, whereas the parameter estimate is not significant in the (correct) formative specification. Thus, the impact of the focal latent variable on other constructs in the structural model tends to be overestimated. Jarvis et al. (2003) and MacKenzie et al. (2005) additionally examine the impact of incorrect specifications of endogenous latent variables. In contrast to the exogenous case, both studies report an underestimation of the parameter estimate capturing the impact of antecedent variables on the focal construct. An explanation for these distinct findings of under- and overestimation for endogenous and exogenous positions, respectively, is the difference in portions of variance accounted for by reflective and formative operationalizations. Specifically, a reflective treatment of a formative construct reduces the variance of the construct (see Fornell, Rhee and Yi, 1991 or Namboodiri, Carter and Blalock, 1975) because the variance of a reflectively-measured construct equals the common variance of its measures, whereas the variance of a formatively-measured construct encompasses the total variance of its measures (Law and Wong, 1999). Consequently, if a misspecification reduces the variance of the exogenous variable while the level of the variance of the endogenous variable is maintained, the parameter estimate for their relationship increases. In contrast, if a misspecification reduces the variance of the endogenous variable while the variance of the exogenous variable is unchanged, the relevant structural parameter estimate decreases. 
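The variance-based reasoning above can be reproduced in a stylized simulation. Treating the misspecification simply as a shrinkage of the affected construct's variance (the shrinkage factor 0.6 and the structural coefficient 0.5 are arbitrary illustrative values), the direction of the bias follows immediately:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

eta = rng.normal(size=n)                       # exogenous construct
y = 0.5 * eta + rng.normal(scale=0.8, size=n)  # structural effect, beta = 0.5

# A reflective treatment of a formative construct retains only the common
# variance of the measures; model this crudely as a score whose standard
# deviation is shrunk by an assumed factor of 0.6.
eta_shrunk = 0.6 * eta
beta_true = np.polyfit(eta, y, 1)[0]           # close to 0.5
beta_exo = np.polyfit(eta_shrunk, y, 1)[0]     # inflated, about 0.5 / 0.6

# Shrinking the endogenous side instead deflates the estimate.
y_shrunk = 0.6 * y
beta_endo = np.polyfit(eta, y_shrunk, 1)[0]    # deflated, about 0.5 * 0.6
```

Shrinking the exogenous score inflates the slope towards 0.5/0.6 ≈ 0.83, while shrinking the endogenous score deflates it towards 0.5 × 0.6 = 0.3, mirroring the over- and underestimation pattern just described.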
In any case, these analyses reveal that structural paths are either overestimated or underestimated as a result of measurement model misspecification with undesirable effects on the substantive interpretation of the structural model relationships. 4.2. Parameter bias due to incorrect item purification To fully capture the meaning of a formatively-measured construct, a census of indicators is (ideally) required because "[o]mitting an indicator is omitting a part of the construct" (Bollen and Lennox, 1991, p. 308). Therefore, an omission of indicators is equivalent to restricting the domain of the construct (MacKenzie et al., 2005). In the context of index construction, this characteristic implies that the elimination of formative items from the item pool has to be theoretically justified rather than purely based on statistical properties (Diamantopoulos and Winklhofer, 2001; Diamantopoulos and Siguaw, 2006). Indeed, "internal-consistency checks on cause-indicators may lead researchers to discard valid measures improperly" (Bollen, 1984, p. 381) and "following standard scale development procedures – for example dropping items that possess low item-to-total correlations – will remove precisely those items that would most alter the empirical meaning of the construct" (Jarvis et al., 2003, p. 202; see also MacKenzie, 2003). In light of the extensive presence of measurement model misspecification discussed earlier, recent studies examine the

consequences of applying conventional scale development procedures to formative measures. Fassot (2006) provides an example of a misspecified measure that leads to the neglect of a key aspect of the focal construct. More specifically, "perceived friendliness of the staff" is erroneously dropped from a measure of hospital quality for not meeting conventional standards for reflective items (i.e., high item-total correlations), even though it is a key aspect of a hospital's quality assessment. In line with this example, Diamantopoulos and Siguaw (2006) find that the same initial item pool results in considerably differing final item sets under reflective and formative purification guidelines respectively. The former approach eliminates items with low inter-item correlations, whereas the latter drops items with high inter-item correlations (which cause problems of multicollinearity). In Diamantopoulos and Siguaw's (2006) example, the resulting scale and index share no more than two of the 30 initial items. Their study therefore demonstrates how erroneous reflective scale purification processes can substantially alter the meaning of formative constructs. Albers and Hildebrandt (2006) address the issue of parameter estimation bias due to incorrect indicator purification. First, the authors compare parameter estimates of a reflectively and a formatively specified measurement model using the full item set prior to purification. Second, they compare two formatively specified models, once using the full item pool (i.e., in accordance with the requirement of a census of items) and once using a reduced item set following purification guidelines for reflective scales. The latter comparison reveals an extensive underestimation of structural parameters, while the former shows no significant differences. Therefore, in this example, it is the erroneous purification rather than the causal order misspecification that drives the parameter bias. 4.3.
Effects on fit statistics The studies in Table 2 also examine the impact of misspecification on goodness-of-fit indices for the overall (i.e., measurement and structural) model. An intuitive expectation is that the consequences of misspecification in terms of changed construct meanings and biased parameter estimates would also lead to poor model fit. However, the majority of models incorporating misspecified constructs show highly acceptable values for CFI, GFI, SRMR and RMSEA. Moreover, these values are similar to the goodness-of-fit values obtained for the corresponding correctly specified model. For example, MacKenzie et al. (2005, p. 724) conclude from their study that "each of the four goodness-of-fit indices failed to detect the misspecification of the measurement model". This equally applies to all other studies listed in Table 2. Only the chi-square (per degree of freedom) statistic is consistently higher in the incorrectly specified reflective models across the studies, thus providing some indication of the underlying misspecification. Summarizing, all studies empirically examining the consequences of measurement model misspecification on parameter estimates report serious under- or overestimation of parameters as a consequence of misspecified causality, wrongly

adopted purification procedures, or a combination of both. Such biases may in turn lead to incorrect conclusions on tested relationships, thus calling many empirical results into question. Especially alarming is the fact that a satisfactory overall model fit does not guarantee a correct specification and that misspecifications are not detected by poor fit index values. It is to be hoped that the empirical demonstration of the undesirable consequences of measurement model misspecification will lend greater weight to Bagozzi's (1984) and Jarvis et al.'s (2003) call to conceptually justify measurement relationships as hypotheses and subsequently test them empirically. 5. The status quo of formative measures: issues and proposed remedies As the introductory section outlines, the literature has only recently started to pay serious attention to formative measurement models and empirical applications are still rare. As a result, experience with formative measures is limited and several conceptual and practical issues have not yet been fully clarified. The following sections discuss such issues and highlight various (sometimes contradictory) views of proposed remedies. 5.1. Conceptual issues 5.1.1. Error-free measures Formative measurement models incorporate the error term at the construct level and specify individual indicators to be error-free (see Eq. (2) earlier). Some researchers find this non-existence of measurement error hard to accept. Edwards and Bagozzi (2000), for example, regard such an assumption as untenable in most situations. Addressing this objection, a model such as the one depicted in Fig. 3 is worth considering as one way of incorporating measurement error into formative measurement models. This model is similar to Edwards and Bagozzi's (2000) "spurious model" with multiple common causes, with the only difference that the latent variables are intentionally introduced to enable the accommodation of measurement error in the indicators.
This model inserts a latent variable ξi for each formative indicator xi so that the focal latent variable η is indirectly linked to the indicators xi via the latent exogenous constructs ξi. Doing

Fig. 3. Modified formative model with individual error terms.

so, each formative indicator becomes a (single) reflective measure of its respective latent variable ξi and consequently comprises an error term; hence, the assumption of error-free indicators is relaxed. Although this model has the substantial advantage of incorporating measurement error, its conceptual justification is questionable for several reasons. First, the inclusion of the first-order constructs ξi introduces a "fictitious" level, which adversely affects model parsimony and suggests that a latent variable can more or less automatically be specified for any manifest variable. Second, given that the xi are not directly linked to η, they cannot legitimately be considered indicators of η, because indicators need to be linked by means of a direct relationship to the construct they assess. Third, the measures of the ξi in Fig. 3 are single indicators, with all the drawbacks such indicators entail (such as high specificity and low reliability). As a discussion of potential problems with single-item measures is beyond the scope of this paper, the reader is referred to Gardner, Cummings, Dunham and Pierce (1998) and Nunnally and Bernstein (1994) for further details. 5.2. Interpretation of the error term Eq. (2) and Fig. 1 (Panel 2) show that a formative measurement model specification includes an error (disturbance) term at the construct level. This error term represents the surplus meaning of the construct (Jarvis et al., 2003; Temme, 2006) which is not captured by the set of formative indicators included in the model specification. Diamantopoulos (2006, p. 7) points out that "previous discussions of the error term are often problematic and fail to provide […] a clear interpretation of exactly what the error term represents". Jarvis et al. (2003), for example, describe the error term as the collective (i.e., overall) random error of all formative indicators taken as a group, while MacKenzie et al. (2005, p.
712) interpret the error estimate as capturing "the invalidity of the set of measures — caused by measurement error, interactions among the measures, and/or aspects of the construct domain not represented by the measures". However, the first source of error, that is, measurement error, is conceptually incorrect. Diamantopoulos (2006) demonstrates that the error term does not represent measurement error because formative indicators are specified to be error-free and, therefore, measurement error cannot be included in the error term at the construct level. The second source, that is, measure interactions, is statistically plausible but lacks substantive interpretation. Since formative indicators determine the meaning of the latent variable, it is not possible to separate the construct's meaning from the indicators' content (Diamantopoulos, 2006). If two indicators show interaction effects, these effects would also form the construct's meaning, just as both indicators separately do. The third source is indeed the correct interpretation of the nature of the error term, that is, aspects of the construct domain not represented by the indicators. Specifically, "the error term in a formative measurement model represents the impact of all remaining causes other than those represented by the indicators included in the model" (Diamantopoulos, 2006, p. 11). Formative latent variables have a number of proximal causes which researchers try to

identify when conceptually specifying the construct. However, in many cases researchers will be unable to detect all possible causes as there may be some which have neither been discussed in prior literature nor revealed by exploratory research. The construct-level error term represents these missing causes. This means that the more comprehensive the set of formative indicators specified for the construct, the smaller the influence of the error term. Williams et al. (2003, p. 908) note in this context that "as the variance of the residual increases, the meaning of the construct becomes progressively ambiguous". 5.3. Estimation of formative models 5.3.1. Multicollinearity Multicollinearity is an undesirable property in formative models as it causes estimation difficulties (Albers and Hildebrandt, 2006; Diamantopoulos and Winklhofer, 2001). These estimation problems arise because a multiple regression links the formative indicators to the construct (see Eq. (2)). Substantial correlations among formative indicators result in unstable estimates for the indicator coefficients γi, and it becomes difficult to separate the distinct influence of individual indicators on the latent variable η. Diamantopoulos and Winklhofer (2001) further note that multicollinearity leads to difficulties in assessing indicator validity on the basis of the magnitude of the parameters γi (Bollen, 1984; MacKenzie et al., 2005). The literature proposes different approaches for dealing with multicollinearity. Bollen and Lennox (1991) argue that indicators which highly intercorrelate are almost perfect linear combinations of each other and thus quite likely contain redundant information. Based on this view, several authors (e.g., Diamantopoulos and Winklhofer, 2001; Götz and Liehr-Gobbers, 2004) suggest indicator elimination based on the variance inflation factor (VIF), which assesses the degree of multicollinearity.
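The VIF itself is straightforward to compute from auxiliary regressions: each indicator is regressed on the remaining indicators, and the VIF is 1/(1 − R²). A minimal sketch with simulated data (the near-redundant second indicator is a constructed example):

```python
import numpy as np

def vif(X):
    """Variance inflation factors: regress each column of X on the
    remaining columns and return 1 / (1 - R^2) per column."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    factors = []
    for j in range(p):
        y = X[:, j]
        Z = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        r2 = 1.0 - (y - Z @ beta).var() / y.var()
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)

rng = np.random.default_rng(3)
x1 = rng.normal(size=300)
x2 = x1 + rng.normal(scale=0.1, size=300)  # nearly redundant with x1
x3 = rng.normal(size=300)                  # unrelated indicator
V = vif(np.column_stack([x1, x2, x3]))
print(np.round(V, 1))  # first two values far above 10, third near 1
```

Under a cut-off of 10, the first two indicators would be flagged; as the quoted guidance stresses, however, elimination should not rest on this statistic alone.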
Some empirical studies on formative measure development (e.g., Diamantopoulos and Siguaw, 2006; Helm, 2005; Sánchez-Pérez and Iniesta-Bonillo, 2004; Witt and Rode, 2005) follow this advice, usually applying the commonly accepted cut-off value of VIF > 10 or its tolerance equivalent (see Giere et al., 2006; Hair, Anderson, Tatham and Black, 1998; Kennedy, 2003). However, considering that this multicollinearity check leads to indicator elimination on purely statistical grounds, and given the danger of altering the meaning of the construct by excluding indicators (Bollen and Lennox, 1991), “[i]ndicator elimination – by whatever means – should not be divorced from conceptual considerations when a formative measurement model is involved” (Diamantopoulos and Winklhofer, 2001, p. 273). Albers and Hildebrandt (2006) put forward a different approach for overcoming multicollinearity: combining formative indicators into an index (using either an arithmetic or geometric mean) and using the latter as a single-item construct in the subsequent analysis. However, although intuitively appealing, this suggestion raises two important questions. First, what is the interpretation of the joint index of two indicators in terms of its substantive meaning? If, for example, income and age show a high intercorrelation (which appears to be a likely assumption) and their measures are

consequently combined into an index, what exactly does this index capture? Second, having included this index in Eq. (2), what kind of information does its corresponding regression parameter estimate provide? Is it the impact of a joint unit change in both income and age?

5.3.2. Exogenous variable intercorrelations
One general issue when specifying measurement models is the specification of inter-indicator correlations. In reflective models, a common approach is to free all covariances among exogenous variables, allowing for intercorrelations. In formative models, following this strategy leads to a large number of additional parameters, namely estimates of covariances between (a) formative indicators within a construct, (b) formative indicators between constructs, and (c) exogenous latent constructs (Bollen and Lennox, 1991; MacCallum and Browne, 1993). Bollen and Lennox (1991) recommend allowing for intercorrelation of formative measures which relate to the same construct (without, however, expecting any specific pattern). Furthermore, they argue that for both reflectively- and formatively-measured constructs it is likely, though not necessary, that item correlations within constructs exceed item correlations between constructs. Based on this argument, MacCallum and Browne (1993) consider two possible approaches of specifying correlations. The first approach specifies formative indicators of the same construct to be correlated with each other but uncorrelated with indicators of other constructs. The obvious advantage of this procedure is that it retains model parsimony as no non-hypothesized paths are added. The obtained goodness-of-fit indices are hence solely based on the hypothesized relationships, that is, the relationships of interest. The shortcoming of this approach, however, is that the fixing of covariances to zero leads to blocks of zeros in the implied covariance matrix.
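To make the structure concrete, here is a minimal numeric sketch of this first approach for two constructs with two and three formative indicators; all covariance values are invented for illustration:

```python
import numpy as np

# Within-construct indicator covariances are freely estimated
# (values invented for illustration only).
S1 = np.array([[1.0, 0.4],
               [0.4, 1.0]])           # construct 1: indicators x1, x2
S2 = np.array([[1.0, 0.3, 0.1],
               [0.3, 1.0, 0.2],
               [0.1, 0.2, 1.0]])      # construct 2: indicators x3, x4, x5

# Fixing all between-construct indicator covariances to zero yields an
# implied covariance matrix with off-diagonal blocks of zeros.
implied = np.zeros((5, 5))
implied[:2, :2] = S1
implied[2:, 2:] = S2
```

The zero blocks assert that, for example, x1 and x3 are exactly uncorrelated in the population, which is the substantive assumption discussed next.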
These zero covariances assume that the corresponding indicators and/or latent variables are perfectly uncorrelated. MacCallum and Browne (1993) note that this assumption carries substantive meaning for the model which requires theoretical justification. They therefore refrain from recommending this approach. Jarvis et al. (2003) further argue that any common cause of the concerned variables that is not incorporated in the model contributes to a lack of model fit. Consequently, they also conclude that fixing covariances to zero is an inappropriate method. The second approach specifies formative indicators to be correlated with each other as well as with indicators of other constructs or exogenous variables. The major advantage of this method is that all variables are allowed to covary instead of assuming complete independence, which is theoretically not justifiable. This approach, however, also raises a number of problematic issues. First, the number of parameters to be estimated increases, thereby decreasing the number of degrees of freedom. Second, MacCallum and Browne (1993) empirically show that the additional parameters provide little explanatory value. Consequently, models lack parsimony without providing substantive meaning in explaining inter-measure


relationships. Furthermore, the estimates for unhypothesized parameters influence the overall model fit, even though they are not of interest. Despite these shortcomings, MacCallum and Browne (1993) recommend this method over the option of having zero blocks in the implied covariance matrix. Jarvis et al. (2003) agree with this recommendation but stress that locating the impact of the non-hypothesized parameter estimates on the model fit is necessary. They suggest estimating a series of nested models, that is, freeing parameters step by step and comparing the overall model fit across steps.

5.3.3. Model identification
A major concern with formative measurement models is how to establish statistical identification to enable their estimation. In isolation, formatively-measured constructs as defined by Eq. (2) are underidentified (Bollen and Lennox, 1991; MacCallum and Browne, 1993; Temme, 2006) and, thus, cannot be estimated. This inability to estimate formative measurement models without the introduction of additional information (see below) has resulted in criticisms of the value of formative measurement in general (see Howell et al., 2007, 2008-this issue). As with reflective measurement models, two necessary, but not sufficient, conditions have to be met for identifying models including formatively-measured constructs (Bollen, 1989; Bollen and Davis, 1994; Cantaluppi, 2002a,b; Edwards, 2001; Temme, 2006). First, the number of non-redundant elements in the covariance matrix of the observed variables needs to be greater than or equal to the number of unknown parameters in the model (t-rule). Second, the latent construct needs to be scaled (scaling rule).
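The t-rule is a simple count: with p observed variables, the covariance matrix supplies p(p + 1)/2 non-redundant elements, which must be at least the number of free parameters. The helper below is hypothetical and for illustration only:

```python
def t_rule_satisfied(num_observed: int, num_free_params: int) -> bool:
    """Necessary (but not sufficient) identification condition: the
    p(p+1)/2 non-redundant elements of the observed covariance matrix
    must be >= the number of free parameters in the model."""
    p = num_observed
    return p * (p + 1) // 2 >= num_free_params
```

For example, five observed variables supply 15 non-redundant covariance elements, so a model with 16 free parameters would fail the t-rule before any deeper identification issues are even considered.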
For the latter condition, three main options are available (Bollen and Davis, 1994; MacCallum and Browne, 1993), namely (a) fixing a path from a formative indicator to the construct, (b) fixing a path from the formatively-measured construct to a reflectively-measured endogenous latent variable, or (c) standardizing the formatively-measured construct by fixing its variance to unity. Edwards (2001) advocates the last of these options because fixing path parameters precludes estimating standard errors of theoretically interesting relationships. Note that the choice of scaling method can affect substantive conclusions, as the significance of different relationships in the model with a formatively-measured construct may vary depending on how the scale of the latter is set (see Franke et al., 2008-this issue). The t-rule and scaling rule are, however, not sufficient conditions for identifying formative measurement models. In this context, Bollen (1989) draws attention to the fact that the formative measurement model needs to be placed within a larger model that incorporates consequences (i.e., effects) of the latent variable in question to enable its estimation. Specifically, for identifying the disturbance term ζ at the construct level, the formative latent variable needs to emit at least two paths to other (reflective) constructs or indicators (MacCallum and Browne, 1993); the literature also refers to this condition as the 2+ emitted paths rule (Bollen and Davis, 1994). The literature discusses three approaches for applying the 2+ emitted paths rule: (a) adding two reflective indicators to the formatively-measured construct, (b) adding two reflectively-measured constructs as outcome variables, and (c) a mixture of these two approaches, that is, adding a single reflective indicator and a reflectively-measured construct as an outcome variable.

5.3.3.1. Adding two reflective indicators
The first option is adding two reflective measures to the set of formative indicators (see Fig. 4). Jarvis et al. (2003) and MacKenzie et al. (2005) advocate this method based on the key arguments that (a) this approach does not require adding constructs to the model solely for identification purposes (which contributes to model parsimony), and (b) measurement parameters are stable and less sensitive to changes in structural parameters. However, this model allows for different conceptual interpretations (Jarvis et al., 2003), namely (a) a MIMIC model (Jöreskog and Goldberger, 1975), (b) an endogenous construct with two reflective indicators that is influenced by exogenous observed variables, or (c) a formatively-measured construct which influences indicators of another construct. MacKenzie et al. (2005) argue that the constellation resulting from adding two reflective measures to the formative specification should not be interpreted as a MIMIC model but as a latent variable having a mixture of formative and reflective indicators (since both types of indicators belong to the same concept domain and are content-valid operationalizations of the same construct). In contrast, MacCallum and Browne (1993), Scholderer and Balderjahn (2006) and Temme (2006) explicitly equate models with mixed indicators and MIMIC models. It is outside the scope of this paper to discuss these interpretations in detail, but it should be stressed that despite the different possible interpretations at a conceptual level, there are no differences at the empirical level (the models yield the same parameter estimates).
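As a check on the logic, the Fig. 4 constellation can be simulated under invented parameter values (the γ, λ, and error values below are assumptions, not estimates from any study). Because each reflective indicator is linear in η, regressing y1 on the formative indicators recovers the products λ1γ1 and λ1γ2:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 20_000

# Formative indicators with invented weights; eta is the latent variable.
x1, x2 = rng.normal(size=n), rng.normal(size=n)
gamma1, gamma2 = 0.6, 0.8
eta = gamma1 * x1 + gamma2 * x2 + 0.5 * rng.normal(size=n)  # + zeta

# Two reflective indicators added for identification; lambda1 is fixed
# to 1 to scale the construct.
lambda1, lambda2 = 1.0, 0.5
y1 = lambda1 * eta + 0.5 * rng.normal(size=n)
y2 = lambda2 * eta + 0.5 * rng.normal(size=n)

# y1 = lambda1*gamma1*x1 + lambda1*gamma2*x2 + noise, so OLS of y1 on
# the formative indicators recovers lambda1*gamma1 and lambda1*gamma2.
Z = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(Z, y1, rcond=None)
```

With λ1 fixed to unity, the recovered slopes approximate γ1 and γ2 themselves, which is why the added reflective measures make the otherwise underidentified formative model estimable.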

Fig. 4. Identification using a MIMIC model.


Fig. 5. Identification with two reflectively-measured constructs.

5.3.3.2. Adding two reflective constructs
According to Bollen and Davis (1994), another option for establishing model identification is the specification of two structural relations from the formative latent variable to two reflectively-measured constructs (Fig. 5). While these two reflectively-measured constructs need to be unrelated in models which only comprise the focal formatively-measured construct and the two reflectively-measured constructs (as in Fig. 5), the reflective constructs may be causally related in larger models (Temme, 2006). This model is justifiable in cases where two reflectively-measured constructs can be included in the nomological network based on theoretical considerations. However, including reflectively-measured outcome variables purely for identification reasons puts the theoretical model specification into question if these outcomes are not of theoretical interest. Note also that the choice of outcome constructs potentially affects the interpretation of the formatively-measured construct itself by influencing the estimates of the γ-parameters (see Heise, 1972; Howell et al., 2007; and also Franke et al., 2008-this issue). Indeed, as Bagozzi (2007, p. 236) observes, “the parameters relating the observed variables to their purported formative latent variable are functions of the number and nature of endogenous latent variables and their measures”.

5.3.3.3. Adding one reflective indicator and one reflective construct
This model is a mixture of the two previous procedures and involves adding one reflective indicator to the latent construct and linking the latter to a reflectively-measured latent variable (Fig. 6). This mixed approach is applicable if the theoretical model includes only one structural relationship of the formatively-measured latent variable to a reflectively-measured latent variable. In this case, including a reflective indicator such as a global measure helps to overcome underidentification and might simultaneously be used for validation purposes (see Diamantopoulos and Winklhofer, 2001). Temme (2006) demonstrates that the 2+ emitted paths rule is a necessary but not sufficient condition for identification when the two reflectively-measured outcome

Fig. 6. Identification with one reflective measure and one reflectively-measured construct.


constructs are either directionally related (i.e., one directly impacts on the other) or their disturbance terms are correlated. These models require imposing further restrictions in order to establish full model identification (such as fixing the covariance of the disturbance terms to zero, or using a partially reduced-form model; for details, see Bollen and Davis, 1994; Cantaluppi, 2002a,b; and Temme, 2006). Finally, models which violate the 2+ emitted paths rule because they contain formatively-measured constructs that emit only one path can be identified by fixing the variance of the disturbance term to zero (MacCallum and Browne, 1993). MacCallum and Browne (1993) advise applying this approach with caution, as it implies the theoretical assumption that the formative indicators completely capture the construct. In other words, this approach assumes that a census of indicators of the latent variable is undertaken at the item generation stage and, hence, no unexplained variance exists. By fixing the disturbance term to zero, the formative construct becomes a weighted linear combination of its indicators without any surplus meaning (Diamantopoulos, 2006; MacKenzie et al., 2005). Although there are examples of constructs for which all possible indicators could conceivably be specified (Diamantopoulos, 2006), in most cases this assumption is not reasonable (Bollen and Davis, 1994) and therefore setting the error term to zero is not justifiable. Finally, like all formative measurement models, the three higher-order models in Fig. 2 are statistically underidentified in isolation and cannot be estimated. Since a discussion of necessary conditions for identifying higher-order models is beyond the scope of this review, the reader is referred to Albers and Götz (2006), Cantaluppi (2002a,b), Edwards (2001), Giere et al. (2006), Jarvis et al. (2003), Temme (2006), and Williams et al. (2003).

5.4. Reliability and validity assessment of formative models

5.4.1. Reliability assessment
As the correlations between formative indicators may be positive, negative or zero (Bollen, 1984; Diamantopoulos and Winklhofer, 2001), reliability in an internal consistency sense is not meaningful for formative indicators (Bagozzi, 1994; Hulland, 1999). As Nunnally and Bernstein (1994) put it, “internal consistency is of minimal importance because two variables that might even be negatively correlated can both serve as meaningful indicators of a construct”. Similarly, Bollen and Lennox (1991) explicitly warn researchers not to rely on correlation matrices for indicator selection as this might lead to eliminating valid measures. While Rossiter (2002, p. 388) condemns all sorts of reliability assessment, claiming that “for a formed attribute, there is […] no question of unreliability”, and several other authors skip the issue of reliability assessment when discussing formative measure development (e.g., Diamantopoulos and Winklhofer, 2001; Eggert and Fassot, 2003), Bagozzi (1994) and Diamantopoulos (2005) recommend reliability assessment for formative indicators in the form of test-retest reliability (see e.g., DeVellis, 2003; Spector, 1992). MacKenzie et al. (2005) additionally propose using the correlation between formative

indicators and an alternative measure assessing the focal construct. What needs to be clarified, however, is how such a correlation should be interpreted. Would a non-significant correlation unambiguously mean that the focal measure lacks reliability? What if the alternative measure is itself unreliable? Does this approach actually test the reliability of the focal measure, or is it rather a test of convergent validity?

5.4.2. Validity assessment
One of the most controversial issues in the formative measurement literature is validity assessment. Some researchers argue that no quantitative quality checks are applicable for assessing the appropriateness of formative indices (e.g., Homburg and Klarmann, 2006). Others note that the applicability of statistical procedures is limited as the choice of formative indicators determines the conceptual meaning of the construct (Albers and Hildebrandt, 2006). Rossiter (2002, p. 315) dismisses any validity assessment for formative indicators, claiming that “all that is needed is a set of distinct components as decided by expert judgment”. However, most researchers do not share the above views. Edwards and Bagozzi (2000, p. 171), for example, stress that “if measures are specified as formative, their validity must still be established. It is bad practice to […] claim that one's measures are formative, and do nothing more”.

5.4.2.1. Individual indicator validity
Bollen (1989) argues that the γ-parameters, which reflect the impact of the formative indicators on the latent construct (see Eq. (2)), indicate indicator validity. The γ-parameters capture the contribution of the individual indicator to the construct; therefore, items with nonsignificant γ-parameters should be considered for elimination as they cannot represent valid indicators of the construct (assuming that multicollinearity is not an issue).
Diamantopoulos and Winklhofer (2001) build upon Bollen's (1989) argument and recommend using a MIMIC model because it simultaneously allows estimation of the γ-parameters and provides an overall model fit (which is indicative of the validity of the formative indicators as a set). An alternative (or additional) approach is assessing indicator validity by estimating the indicators' correlations with an external variable. For example, Diamantopoulos and Winklhofer (2001) suggest including a global measure summarizing the essence of the construct (see also Fayers et al., 1997). Assuming that the overall measure is a valid criterion, the relationship between a formative indicator and the overall measure indicates indicator validity (Eggert and Fassot, 2003; MacKenzie et al., 2005). Following this approach, indicators correlating highly with the external variable are retained, whereas those showing low or nonsignificant relationships are candidates for elimination. Lastly, a formative measurement model specification implies that the latent variable completely mediates the effects of its indicators on other (outcome) variables (see Figs. 4 and 5). This implies certain proportionality constraints on the model coefficients (Bollen and Davis, 1994; Hauser, 1973). If such proportionality constraints do not hold for a particular indicator, the validity of the latter is questionable (see Franke et al., 2008-this issue).
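The proportionality implication can be illustrated on simulated data (all parameter values invented): when η fully mediates the indicators' effects on two outcomes y1 and y2, the ratio cov(xi, y1)/cov(xi, y2) equals λ1/λ2 for every indicator:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 50_000

# Three formative indicators with invented weights; the last term is zeta.
x1, x2, x3 = (rng.normal(size=n) for _ in range(3))
eta = 0.5 * x1 + 0.7 * x2 + 0.3 * x3 + 0.4 * rng.normal(size=n)

# Two reflective outcomes with loadings lambda1 = 1.0 and lambda2 = 0.6.
y1 = 1.0 * eta + 0.5 * rng.normal(size=n)
y2 = 0.6 * eta + 0.5 * rng.normal(size=n)

# Full mediation implies cov(x_i, y1) / cov(x_i, y2) = lambda1 / lambda2
# for every indicator i, regardless of the indicator's own weight.
ratios = [np.cov(x, y1)[0, 1] / np.cov(x, y2)[0, 1] for x in (x1, x2, x3)]
```

An indicator with a direct effect on only one outcome, bypassing η, would produce a deviating ratio, flagging it as a questionable formative indicator.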


5.4.2.2. Construct validity
After examining validity at the individual indicator level, the next step involves assessing validity at the overall construct level. An important point in this regard is that “causal indicators are not invalidated by low internal consistency, so to assess validity we need to examine other variables that are effects of the latent variable” (Bollen and Lennox, 1991, p. 312, emphasis added). One common approach is focusing on nomological (Jarvis et al., 2003; MacKenzie et al., 2005; Reinartz et al., 2004) and criterion-related (Diamantopoulos and Siguaw, 2006; Edwards, 2001; Jarvis et al., 2003) validity. MacKenzie et al. (2005) suggest proceeding as with reflective scales, that is, estimating hypothesized relationships of the focal construct with theoretically related constructs. These estimated relationships should be consistent with the expected direction and be significantly different from zero. Diamantopoulos and Winklhofer (2001) also underline the importance of nomological validation, particularly in cases where indicators have been purified. Rossiter (2002, p. 327) challenges the approach of evaluating the validity of a formative index by relating it to other constructs, arguing that “[a] scale's validity should be established independently for the construct”. In response, Diamantopoulos (2005) points out that, by definition, all forms of validity, with the exception of face and content validity, are defined in terms of relationships with other measures (see Carmines and Zeller, 1979; Zeller and Carmines, 1980). Concerning other types of validity assessment, Bagozzi (1994, p. 338) states that “construct validity in terms of convergent and discriminant validity [is] not meaningful when indexes are formed as linear sums of measurement”. In contrast, MacKenzie et al. (2005) suggest that standard procedures for assessing discriminant validity are equally applicable to formative indexes; these include testing (a) whether the focal construct correlates less than perfectly with related constructs, and/or (b) whether it shares less than half of its variance with any other construct, that is, whether construct intercorrelations are less than .71 (Fornell and Larcker, 1981). Diamantopoulos (2006) proposes using the variance of the error term as an indication of construct validity. Since the error term captures aspects of the construct's domain that the set of indicators neglects, the lower the variance of the error term, the more valid the construct (see also Williams et al., 2003). If the set of indicators is comprehensive in the sense that it includes all important construct facets, the construct's meaning is validly captured; accordingly, the residual variance is likely to be small. Finally, confirmatory tetrad analysis (CTA) (Bollen and Ting, 1993, 2000; Eberl, 2006; Gudergan et al., 2008-this issue) offers a basic test of construct validity. Although Bollen and Ting (2000, p. 4) originally propose CTA as “an empirical test of whether a causal or effect indicator specification is appropriate”, interpreting evidence supporting the latter as also supporting the construct's validity is reasonable.

6. Conclusion and future research
Building on a review of the literature relating to the specification, estimation, and validation of formative measurement

models, this article hopefully lends a helping hand to researchers considering the adoption of formative measurement in their empirical efforts, while, at the same time, encouraging a critical perspective in the application of formative indicators. Concerning future research, one major issue concerns the conceptual plausibility of formatively-measured constructs occupying endogenous positions in structural models. While a number of studies incorporate formative latent variables in such positions (e.g., Edwards, 2001; Jarvis et al., 2003; MacKenzie et al., 2005), Wiley (2005, p. 124, emphasis in original) notes that there is “no mechanism by which an antecedent variable can influence a formative index”. Since the set of causal indicators and the disturbance term jointly account for the total variation of a formatively-measured construct, the specification of an additional source of variation (i.e., an antecedent construct) is conceptually questionable. Given the conceptual and practical importance of this issue, a debate on the use of formatively-measured constructs as endogenous variables is urgently required. Another issue for future research concerns modeling formatively-measured constructs as moderator variables in structural models. Although the literature provides empirical examples of employing formatively specified moderators (e.g., Reinartz et al., 2004), more research using formatively-measured constructs when forming interaction terms is needed. Finally, there is a debate on whether formative measurement is really necessary, that is, whether it should be used in the first place. Bagozzi (2007, p. 236), for example, states that “formative measurement can be done but only for a limited range of cases and under restrictive assumptions”, while Howell et al. (2007, p. 216; see also Howell et al., 2008-this issue) argue that “formative measurement is not an equally attractive alternative [to reflective measurement]”.
Although there are those (including the present authors and Bollen, 2007) who feel that, despite its various shortcomings, formative measurement is indeed a viable alternative to reflective measurement on conceptual grounds, further theoretical and methodological research is necessary to finally settle this debate. Time will tell.

References
Albers S, Götz O. Messmodelle mit Konstrukten zweiter Ordnung in der betriebswirtschaftlichen Forschung. Betriebswirtschaft 2006;66(6):669–77.
Albers S, Hildebrandt L. Methodische Probleme bei der Erfolgsfaktorenforschung — Messfehler, formative versus reflektive Indikatoren und die Wahl des Strukturgleichungs-Modells. Zfbf 2006;58:2–33.
Anderson J, Gerbing D. Some methods for respecifying measurement models to obtain unidimensional construct measurement. J Mark Res 1982;19(4):453–60.
Arnett DB, Laverie DA, Meiers A. Developing parsimonious retailer equity indexes using partial least squares analysis: a method and applications. J Retail 2003;79:161–70.
Bagozzi RP. A prospectus for theory construction in marketing. J Mark 1984;48:11–29.
Bagozzi RP. Structural equation models in marketing research: basic principles. In: Bagozzi RP, editor. Principles of marketing research. Oxford: Blackwell; 1994. p. 317–85.
Bagozzi RP. On the meaning of formative measurement and how it differs from reflective measurement: comment on Howell, Breivik, and Wilcox. Psychol Methods 2007;12(2):229–37.

Bagozzi RP, Heatherton TF. A general approach to representing multifaceted personality constructs: application to state self-esteem. Struct Equ Modeling 1994;1(1):35–67.
Blalock HM. Causal inferences in nonexperimental research. Chapel Hill: University of North Carolina Press; 1964.
Blalock HM. Theory building and causal inferences. In: Blalock HM, Blalock A, editors. Methodology in social research. New York: McGraw-Hill; 1968. p. 155–98.
Blalock HM. Causal models involving unobserved variables in stimulus-response situations. In: Blalock HM, editor. Causal models in the social sciences. Chicago: Aldine; 1971. p. 335–47.
Bollen K. Multiple indicators: internal consistency or no necessary relationship? Qual Quant 1984;18:377–85.
Bollen K. Structural equations with latent variables. New York: Wiley; 1989.
Bollen K, Lennox R. Conventional wisdom on measurement: a structural equation perspective. Psychol Bull 1991;110(2):305–14.
Bollen K. Latent variables in psychology and the social sciences. Annu Rev Psychol 2002;53:605–34.
Bollen K. Interpretational confounding is due to misspecification, not to type of indicator: comment on Howell, Breivik, and Wilcox. Psychol Methods 2007;12(2):219–28.
Bollen K, Ting K. Confirmatory tetrad analysis. In: Marsden PV, editor. Sociological methodology. Washington, DC: American Sociological Association; 1993. p. 147–75.
Bollen K, Davis W. Causal indicator models: identification, estimation, and testing. Paper presented at the American Sociological Association Convention, Miami; 1994.
Bollen K, Ting K. A tetrad test for causal indicators. Psychol Methods 2000;5(1):3–22.
Brewer P. Operationalizing psychic distance: a revised approach. J Int Mark 2007;15(1):44–66.
Brock JK, Zhou Y. Organizational use of the internet — scale development and validation. Internet Res 2005;15(1):67–87.
Bruhn M, Georgi D, Hadwich K.
Customer equity management as a formative second-order construct. Journal of Business Research 2008;61:1292–301 (this issue). doi:10.1016/j.jbusres.2008.01.016.
Cadogan JW, Souchon AL, Procter DB. The quality of market-oriented behaviors: formative index construction and validation. Journal of Business Research 2008;61:1263–77 (this issue). doi:10.1016/j.jbusres.2008.01.014.
Cantaluppi G. Some further remarks on parameter identification of structural equation models with both formative and reflexive relationships. In: A.A., V.V., editors. Studi in onore di Angelo Zanella. Milano: Vita e Pensiero; 2002a. p. 89–104.
Cantaluppi G. The problem of parameter identification of structural equation models with both formative and reflexive relationships: some theoretical results. Serie Edizioni Provvisorie 2002b; No. 108, Istituto di Statistica, Università Cattolica del S. Cuore, Milano:1–19.
Carmines EG, Zeller RA. Reliability and validity assessment. In: Sullivan JL, editor. Quantitative applications in the social sciences. Beverly Hills: Sage; 1979.
Churchill GA. A paradigm for developing better measures of marketing constructs. J Mark Res 1979;16:64–73.
Cohen P, Cohen J, Teresi J, Marchi M, Velez CN. Problems in the measurement of latent variables in structural equations causal models. Appl Psychol Meas 1990;14:183–96.
Collier JE, Bienstock CC. Measuring service quality in e-retailing. J Serv Res 2006;8(3):260–75.
Curtis RF, Jackson EF. Multiple indicators in survey research. Am J Sociol 1962;68:195–204.
DeVellis RF. Scale development — theories and applications. Applied social research methods series. 2nd edition. Sage Publications; 2003.
Diamantopoulos A. Viewpoint: export performance measurement: reflective versus formative indicators. Int Mark Rev 1999;16(6):444–57.
Diamantopoulos A. The C-OAR-SE procedure for scale development in marketing: a comment. Int J Res Mark 2005;22:1–9.
Diamantopoulos A.
The error term in formative measurement models: interpretations and modelling implications. J Modell Manage 2006;1(1):7–17.
Diamantopoulos A, Winklhofer H. Index construction with formative indicators: an alternative to scale development. J Mark Res 2001;38(2):269–77.


Diamantopoulos A, Siguaw J. Formative versus reflective indicators in organizational measure development: a comparison and empirical illustration. Br J Manage 2006;17(4):263–82.
Dowling G. Journalists' evaluation of corporate reputations. Corp Reputation Rev 2004;7(2):196–205.
Duncan OD. Notes on social measurement: historical and critical. New York: Russell Sage; 1984.
Eberl M. Formative und reflektive Konstrukte und die Wahl des Strukturgleichungsverfahrens. Betriebswirtschaft 2006;66(6):651–68.
Eberl M, Schwaiger M. Corporate reputation: disentangling the effects on financial performance. Eur J Mark 2005;39:838–54.
Edwards JR. Multidimensional constructs in organizational behavior research: an integrative analytical framework. Organ Res Methods 2001;4(2):144–92.
Edwards JR, Bagozzi R. On the nature and direction of relationships between constructs and measures. Psychol Methods 2000;5(2):155–74.
Eggert A, Fassot G. Zur Verwendung formativer und reflektiver Indikatoren in Strukturgleichungsmodellen. Kaiserslaut Schr reihe Mark 2003;20:1–18.
Fassot G. Operationalisierung latenter Variablen in Strukturgleichungsmodellen: Eine Standortbestimmung. Zfbf 2006;58:67–88.
Fassot G, Eggert A. Zur Verwendung formativer und reflektiver Indikatoren in Strukturgleichungsmodellen: Bestandsaufnahme und Anwendungsempfehlung. In: Bliemel FW, Eggert A, Fassot G, Henseler J, editors. Handbuch PLS-Modellierung, Methode, Anwendung, Praxisbeispiele. Stuttgart: Schaeffer-Poeschel; 2005. p. 31–47.
Fayers PM, Hand DJ, Bjordal K, Groenvold M. Causal indicators in quality of life research. Qual Life Res 1997;6:393–406.
Fornell C, Larcker DF. Evaluating structural equation models with unobservable variables and measurement error. J Mark Res 1981;18:39–50.
Fornell C, Bookstein FL. A comparative analysis of two structural equation models: LISREL and PLS applied to market data. In: Fornell C, editor. A second generation of multivariate analysis, vol. 1. New York: Praeger; 1982. p. 289–324.
Fornell C, Rhee BD, Yi Y. Direct regression, reverse regression, and covariance structure analysis. Mark Lett 1991;2(3):309–20.
Franke G, Preacher C, Rigdon E. The proportional structural effects of formative indicators. J Bus Res 2008;61:1229–37 (this issue). doi:10.1016/j.jbusres.2008.01.011.
Gardner D, Cummings L, Dunham R, Pierce J. Single-item versus multiple-item measurement scales: an empirical comparison. Educ Psychol Meas 1998;58(6):898–915.
Giere J, Wirtz B, Schilke O. Mehrdimensionale Konstrukte: Konzeptionelle Grundlagen und Möglichkeiten ihrer Analyse mithilfe von Strukturgleichungsmodellen. Betriebswirtschaft 2006;66(6):678–95.
Götz O, Liehr-Gobbers K. Analyse von Strukturgleichungsmodellen mit Hilfe der Partial-Least-Squares(PLS)-Methode. Betriebswirtschaft 2004;64(6):714–38.
Gudergan SP, Ringle CM, Wende S, Will A. Confirmatory tetrad analysis for evaluating the mode of measurement models in PLS path modeling. J Bus Res 2008;61:1238–49 (this issue).
Hair JF, Anderson RE, Tatham RL, Black WC. Multivariate data analysis. New Jersey: Prentice Hall; 1998.
Hauser RM. Disaggregating a social-psychological model of educational attainment. In: Goldberger AS, Duncan OD, editors. Structural equation models in the social sciences. San Diego: Academic Press; 1973. p. 255–89.
Hauser RM, Goldberger AS. The treatment of unobservable variables in path analysis. Sociol Methodol 1971:81–117.
Heise DR. Employing nominal variables, induced variables, and block variables in path analysis. Sociol Methods Res 1972;1:147–73.
Helm S. Designing a formative measure for corporate reputation. Corp Reputation Rev 2005;8(2):95–109.
Hitt MA, Gimeno J, Hoskisson RE. Current and future research methods in strategic management. Organ Res Methods 1998;1:6–44.
Homburg C, Klarmann M. Die Kausalanalyse in der empirischen betriebswirtschaftlichen Forschung – Problemfelder und Anwendungsempfehlungen. Betriebswirtschaft 2006;66(6):727–48.
Homburg C, Workman JP, Krohmer H. Marketing's influence within the firm. J Mark 1999;63(2):1–17.

Homburg C, Hoyer W, Fassnacht M. Service orientation of a retailer's business strategy: dimensions, antecedents, and performance outcomes. J Mark 2002;66(4):86–101.
Howell RD, Breivik E, Wilcox JB. Reconsidering formative measurement. Psychol Methods 2007;12(2):205–18.
Howell RD, Breivik E, Wilcox JB. Questions about formative measurement. J Bus Res 2008;61:1219–28 (this issue). doi:10.1016/j.jbusres.2008.01.010.
Hulland J. Use of partial least squares (PLS) in strategic management research: a review of four recent studies. Strateg Manage J 1999;20:195–204.
Hyman M, Ganesh G, McQuitty S. Augmenting the household influence construct. J Mark Theory Pract 2002;10(3):13–31.
Jarvis C, MacKenzie S, Podsakoff P. A critical review of construct indicators and measurement model misspecification in marketing and consumer research. J Consum Res 2003;30(2):199–218.
Johansson JK, Yip GS. Exploiting globalization potential: U.S. and Japanese strategies. Strateg Manage J 1994;15(8):579–601.
Johnson GJ, Bruner II GC, Kumar A. Interactivity and its facets revisited. J Advert 2006;35(4):35–52.
Jöreskog K, Goldberger A. Estimation of a model with multiple indicators and multiple causes of a single latent variable. J Am Stat Assoc 1975;70:631–9.
Judge TA, Bretz RD. Person-organization fit and the theory of work adjustment: implications for satisfaction, tenure, and career success. J Vocat Behav 1994;44(1):32–54.
Kennedy P. A guide to econometrics. 5th edition. Cambridge, MA: MIT Press; 2003.
Land K. On estimation of path coefficients for unmeasured variables from correlations among observed variables. Soc Forces 1970;48:506–11.
Law K, Wong C. Multidimensional constructs in structural equation analysis: an illustration using the job perception and job satisfaction constructs. J Manage 1999;25(2):143–60.
Law KS, Wong CS, Mobley WH. Toward a taxonomy of multidimensional constructs. Acad Manage Rev 1998;23(4):741–55.
Lin CH, Sher PJ, Shih HY. Past progress and future directions in conceptualizing customer perceived value. Int J Serv Ind Manag 2005;16(4):318–36.
Long JS. Confirmatory factor analysis: a preface to LISREL. Bloomington, IN: Sage Publications; 1983.
Lord FM, Novick MR. Statistical theories of mental test scores. Reading, MA: Addison-Wesley; 1968.
MacCallum R, Browne M. The use of causal indicators in covariance structure models: some practical issues. Psychol Bull 1993;114(3):533–41.
MacKenzie SB. The danger of poor construct conceptualization. J Acad Mark Sci 2003;31(3):323–6.
MacKenzie S, Podsakoff P, Jarvis C. The problem of measurement model misspecification in behavioral and organizational research and some recommended solutions. J Appl Psychol 2005;90(4):710–30.
Namboodiri NK, Carter LF, Blalock HM. Applied multivariate analysis and experimental designs. New York: McGraw-Hill; 1975.
Netemeyer RG, Bearden WO, Sharma S. Scaling procedures. CA: Sage; 2003.
Nunnally JC. Psychometric theory. 2nd edition. New York: McGraw-Hill; 1978.
Nunnally JC, Bernstein IH. Psychometric theory. 3rd edition. New York: McGraw-Hill; 1994.
Pavlou P, Gefen D. Psychological contract violation in online marketplaces: antecedents, consequences, and moderating role. Inf Syst Res 2005;16(4):372–99.
Podsakoff PM, MacKenzie SB, Podsakoff NP, Lee JY. The mismeasure of man(agement) and its implications for leadership research. Leadersh Q 2003;14:615–56.

Podsakoff NP, Shen W, Podsakoff PM. The role of formative measurement models in strategic management research: review, critique, and implications for future research. Res Methodol Strat Manag 2006;3:197–252.
Reinartz W, Krafft M, Hoyer WD. The customer relationship management process: its measurement and impact on performance. J Mark Res 2004;41(3):293–305.
Rossiter J. The C-OAR-SE procedure for scale development in marketing. Int J Res Mark 2002;19:305–35.
Ruiz DM, Gremler DD, Washburn JH, Cepeda-Carrión G. Service value revisited: specifying a higher-order, formative measure. J Bus Res 2008;61:1278–91 (this issue). doi:10.1016/j.jbusres.2008.01.015.
Sánchez-Pérez M, Iniesta-Bonillo M. Consumers' felt commitment towards retailers: index development and validation. J Bus Psychol 2004;19(2):141–59.
Santosa PI, Wei KK, Chan HC. User involvement and user satisfaction with information-seeking activity. Eur J Inf Syst 2005;14(4):361–70.
Scholderer J, Balderjahn I. Was unterscheidet harte und weiche Strukturgleichungsmodelle nun wirklich? Mark ZFP 2006;28(1):57–70.
Spector PE. Summated rating scale construction: an introduction. Quantitative applications in the social sciences series. CA: Sage Publications; 1992.
Temme D. Die Spezifikation und Identifikation formativer Messmodelle der Marketingforschung in Kovarianzstrukturanalysen. Mark ZFP 2006;28(3):183–209.
Ulaga W, Eggert A. Value-based differentiation in business relationships: gaining and sustaining key supplier status. J Mark 2006;70(1):119–36.
Venaik S, Midgley DF, Devinney TM. A new perspective on the integration-responsiveness pressures confronting multinational firms. Manag Int Rev 2004;44(Special Issue 2004/1):15–48.
Venaik S, Midgley DF, Devinney TM. Dual paths to performance: the impact of global pressures on MNC subsidiary conduct and performance. J Int Bus Stud 2005;36(6):655–75.
Wiley J. Reflections on formative measures: conceptualization and implication for use. ANZMAC Conference, Perth; December 5–7; 2005.
Williams LJ, Edwards JR, Vandenberg RJ. Recent advances in causal modeling methods for organizational and management research. J Manage 2003;29(6):903–36.
Williams LJ, Gavin MB, Hartman NS. Structural equation modeling methods in strategy research: applications and issues. In: Ketchen DJ, Bergh DD, editors. Research methodology in strategy and management. Boston, MA: Elsevier; 2004. p. 303–46.
Winklhofer H, Diamantopoulos A. Managerial evaluation of sales forecasting effectiveness: a MIMIC model approach. Int J Res Mark 2002;19:151–66.
Witt P, Rode V. Corporate brand building in start-ups. J Enterp Cult 2005;13(3):273–94.
Yi MY, Davis FD. Developing and validating an observational learning model of computer software training and skill acquisition. Inf Syst Res 2003;14(2):146–69.
Zeller RA, Carmines EG. Measurement in the social sciences: the link between theory and data. Cambridge: Cambridge University Press; 1980.