AMOS for Beginners

By Hui Bian, Office for Faculty Excellence, Spring 2012

1



What is structural equation modeling (SEM)?  SEM is used to test hypotheses about potential interrelationships among constructs, as well as their relationships to the indicators or measures assessing them.  Example: the theory of planned behavior (TPB)

2



Goals of SEM  To determine whether the theoretical model is supported by the sample data, that is, whether the model fits the data well.  To help us understand the complex relationships among constructs.

3

[Figure: Example of SEM — two latent factors (Factor1 and Factor2), each measured by indicators Indica1–Indica6, with an error term (error1–error6) attached to each indicator] 4

[Figure: Example of SEM, annotated — the measurement models link each set of indicators to its latent factor; the structural model links the two factors]

5



Basic components of SEM  Latent variables (constructs/factors)

 Are the hypothetical constructs of interest in a study, such as self-control, self-efficacy, and intention. They cannot be measured directly.

 Observed variables (indicators)

 Are the variables actually measured during data collection by the researchers using a developed instrument or test. They are used to define or infer the latent variable or construct. Each observed variable represents one definition of the latent variable. 6



Basic components of SEM  Endogenous variables (dependent variables): variables that have at least one arrow leading into them from another variable.  Exogenous variables (independent variables): variables that do not have any arrow leading into them.

7



Basic components of SEM  Measurement error terms

 Represent the amount of variation in an indicator that is due to measurement error.

 Structural error terms (disturbance terms)

 Represent the unexplained variance in the latent endogenous variables due to all unmeasured causes.

8



Basic components of SEM  Covariance: a measure of how much two variables change together.  We use a two-way arrow to show covariance.

9
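Since covariance is the quantity SEM works with throughout, it may help to see it computed by hand. A minimal NumPy sketch with made-up scores (not from the slides):

```python
import numpy as np

# Hypothetical scores on two observed variables (4 cases)
x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([1.0, 3.0, 2.0, 6.0])

# Sample covariance: average cross-product of deviations from the means,
# using the n - 1 denominator
cov_xy = np.sum((x - x.mean()) * (y - y.mean())) / (len(x) - 1)
print(cov_xy)  # prints 4.666666666666667

# np.cov returns the full 2x2 covariance matrix; the off-diagonal
# entry is cov(x, y), matching the hand computation
print(np.cov(x, y)[0, 1])
```

A positive value means the two variables tend to move in the same direction; this is what a two-way arrow between two exogenous variables represents in an AMOS diagram.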



Graphs in AMOS

 Rectangle: represents an observed variable
 Circle or ellipse: represents an unobserved (latent) variable
 Two-way arrow: covariance or correlation
 One-way arrow: unidirectional relationship

10

[Figure: an annotated AMOS diagram labeling the covariance arrow, observed variables, latent variables, paths, measurement error terms, and the structural error term]

11



Model parameters  Are those characteristics of the model that are unknown to the researchers.  They have to be estimated from the sample covariance or correlation matrix.

12



Model parameters

 Regression weights/factor loadings
 Structural coefficients
 Variances
 Covariances
 Each potential parameter in a model must be specified as a fixed, free, or constrained parameter

13



Model parameters  Free parameters: unknown and need to be estimated.  Fixed parameters: they are not free, but are fixed to a specified value, either 0 or 1.  Constrained parameters: unknown, but are constrained to equal one or more other parameters.

14

[Figure: path diagrams illustrating free and fixed parameters; if opp_v1 = opp_v2, they are constrained parameters]

15



Build SEM models  Model specification: the exercise of formally stating a model. Prior to data collection, develop a theoretical model based on theory, empirical studies, etc.  Which variables are included in the model.  How these variables are related.  Misspecified model: results from errors of omission and/or inclusion of any variable or parameter.

16



Model identification: whether the model can, in theory and in practice, be estimated with the observed data.

 Under-identified model: one or more parameters cannot be uniquely determined from the observed data; it is not possible to estimate all of the model's parameters.

17



Model identification  Just-identified model (saturated model): all of the parameters are uniquely determined. For each free parameter, a value can be obtained through only one manipulation of the observed data.  The degrees of freedom equal zero (the number of free parameters exactly equals the number of known values).

 The model fits the data perfectly.  Over-identified model: all the parameters are identified and there are more knowns than free parameters. 18

 A just-identified or over-identified model is an identified model.
 If a model is under-identified, additional constraints may make the model identified.
 The number of free parameters to be estimated must be less than or equal to the number of distinct values in the matrix S.
 The number of distinct values in the matrix S is equal to p(p+1)/2, where p is the number of observed variables. 19
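This counting rule is easy to automate. A small sketch (the function names are my own, not AMOS terminology):

```python
def distinct_moments(p):
    """Number of distinct values in the sample covariance matrix S for
    p observed variables: p variances plus p(p-1)/2 unique covariances."""
    return p * (p + 1) // 2

def identification_status(p, n_free_params):
    """Necessary (but not sufficient) counting condition for identification:
    the free parameters must not exceed the distinct moments."""
    df = distinct_moments(p) - n_free_params
    if df < 0:
        return "under-identified"
    if df == 0:
        return "just-identified (saturated)"
    return "over-identified"

print(distinct_moments(6))            # prints 21
print(identification_status(4, 10))   # prints just-identified (saturated)
```

The counting condition is only necessary: a model can satisfy it and still be empirically under-identified, which is why the rules on the next slide (marker variables, enough indicators) matter.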



How to avoid identification problems  To achieve identification, one of the factor loadings must be fixed to one. The variable with a fixed loading of one is called a marker variable or reference item.  This method solves the scale indeterminacy problem.  Have "enough" indicators of each latent variable. A simple rule that works most of the time: there need to be at least two indicators per latent variable, and those indicators' errors must be uncorrelated.  Use a recursive model  Design a parsimonious model 20



Rules for building SEM models  All variances of independent variables are model parameters.  All covariances between independent variables are model parameters.  All factor loadings connecting the latent variables and their indicators are model parameters.  All regression weights between observed or latent variables are model parameters. 21



Rules for building SEM models  The variances and covariances of dependent variables, and the covariances between dependent and independent variables, are NOT model parameters.  *For each latent variable included in the model, the metric of its latent scale needs to be set.  For any independent latent variable: a path leaving the latent variable is set to 1.  *Paths leading from the error terms to their corresponding observed variables are assumed to be equal to 1. 22

23



Build SEM models: Model estimation  How do SEM programs estimate the parameters?

 The proposed model makes certain assumptions about the relationships between the variables in the model.  The proposed model has specific implications for the variances and covariances of the observed variables. 24



How do SEM programs estimate the parameters?  We want to estimate the parameters specified in the model that produce the implied covariance matrix Σ.  We want the matrix Σ to be as close as possible to the matrix S, the sample covariance matrix of the observed variables.  If the elements of S minus the elements of Σ equal zero, then the chi-square equals zero and we have a perfect fit. 25



How do SEM programs estimate the parameters?  In SEM, the parameters of a proposed model are estimated by minimizing the discrepancy between the empirical covariance matrix, S, and the covariance matrix implied by the model, Σ. How should this discrepancy be measured? This is the role of the discrepancy function.  S is the sample covariance matrix calculated from the observed data.  Σ is the covariance matrix implied by the proposed model, also called the reproduced (or model-implied) covariance matrix. 26
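For ML estimation, a common form of the discrepancy function is F_ML = ln|Σ| + tr(SΣ⁻¹) − ln|S| − p. A sketch of that function, using a made-up 2×2 covariance matrix (not AMOS code):

```python
import numpy as np

def f_ml(S, Sigma):
    """ML discrepancy between the sample covariance matrix S and the
    model-implied matrix Sigma:
        F_ML = ln|Sigma| + tr(S Sigma^-1) - ln|S| - p
    F_ML is 0 exactly when Sigma equals S (a perfect fit)."""
    p = S.shape[0]
    _, logdet_Sigma = np.linalg.slogdet(Sigma)
    _, logdet_S = np.linalg.slogdet(S)
    return logdet_Sigma + np.trace(S @ np.linalg.inv(Sigma)) - logdet_S - p

S = np.array([[4.0, 2.0],
              [2.0, 3.0]])
print(f_ml(S, S))          # identical matrices -> discrepancy ~0
print(f_ml(S, np.eye(2)))  # mismatched matrices -> positive discrepancy
```

Minimizing F_ML over the free parameters yields the estimates; under the usual assumptions, (n − 1)·F_ML at the minimum is the model chi-square statistic discussed later.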



How do SEM programs estimate the parameters?  In SEM, if the difference between S and Σ (the distance between the matrices) is small, one can conclude that the proposed model is consistent with the observed data.  If the difference between S and Σ is large, one can conclude that the proposed model doesn't fit the data.  The proposed model is deficient, or  The data are problematic. 27



Build SEM models  Model estimation

 Estimation of parameters.  The estimation process uses a particular fit function to minimize the difference between S and Σ.  If the difference = 0, one has a perfect model fit to the data.

28



Model estimation methods  The two most commonly used estimation techniques are maximum likelihood (ML) and normal-theory generalized least squares (GLS).  ML and GLS: large sample size, continuous data, and the assumption of multivariate normality.  Unweighted least squares (ULS): scale dependent.  Asymptotically distribution free (ADF) (weighted least squares, WLS): for serious departures from normality. 29

[Figure: AMOS estimation-method options, grouped into those that assume normality and those that do not]

30



Model testing  We want to know how well the model fits the data.  If S and Σ are similar, we may say the proposed model fits the data.  Model fit indices.  For an individual parameter, we want to know whether a free parameter is significantly different from zero.  Whether the estimate of a free parameter makes sense.

31



Chi-square test  The value ranges from zero for a saturated model with all paths included to a maximum for the independence model (the null model, with no parameters estimated).

32



Build SEM models  Model modification  If the model doesn't fit the data, we need to modify the model.  Perform a specification search: change the original model in the search for a better-fitting model.

33



Goodness-of-fit tests based on predicted vs. observed covariances (absolute fit indexes)  Chi-square (CMIN): a non-significant χ2 value indicates that S and Σ are similar. χ2 should NOT be significant if there is a good model fit.  Goodness-of-fit index (GFI) and adjusted goodness-of-fit index (AGFI). GFI measures the amount of variance and covariance in S that is predicted by Σ. AGFI adjusts for the degrees of freedom of a model relative to the number of variables. 34



Goodness-of-fit tests based on predicted vs. observed covariances (absolute fit indexes)  Root-mean-square residual index (RMR). The closer RMR is to 0, the better the model fit.  Hoelter's critical N, also called the Hoelter index, is used to judge whether the sample size is adequate. By convention, the sample size is adequate if Hoelter's N > 200. A Hoelter's N under 75 is considered unacceptably low to accept a model by chi-square. Two N's are output, one at the .05 and one at the .01 level of significance. 35



Information-theory goodness of fit: absolute fit indexes  Measures in this set are appropriate when comparing models estimated with maximum likelihood.  AIC, BIC, CAIC, and BCC.  For model comparison, the lower AIC reflects the better-fitting model. AIC also penalizes for lack of parsimony.  BIC: the Bayesian information criterion. It penalizes for sample size as well as model complexity. It is recommended when the sample size is large or the number of parameters in the model is small. 36



Information-theory goodness of fit: absolute fit indexes  CAIC: an alternative to AIC that also penalizes for sample size as well as model complexity (lack of parsimony). The penalty is greater than for AIC or BCC but less than for BIC. The lower the CAIC, the better the fit.  BCC (Browne–Cudeck criterion): as with AIC, lower values indicate better fit for model comparison. BCC penalizes for model complexity (lack of parsimony) more than AIC. 37



Goodness-of-fit tests comparing the given model with a null or an alternative model.

 CFI, NFI

Goodness-of-fit tests penalizing for lack of parsimony.  Parsimony ratio (PRATIO), PNFI, PCFI

38



Scaling and normality assumption  Maximum likelihood and normal-theory generalized least squares assume that the measured variables are continuous and have a multivariate normal distribution.  In the social sciences, we use many variables that are dichotomous or ordered-categorical rather than truly continuous.  In the social sciences, it is common for the distribution of observed variables to depart substantially from multivariate normality. 39



Scaling and normality assumption  Nominal or ordinal variables should have at least five categories and not be strongly skewed or kurtotic.  Values of skewness and kurtosis should be within -1 and +1.

40
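The skewness/kurtosis screen above can also be done outside AMOS. A minimal NumPy sketch with simulated (not real) data:

```python
import numpy as np

def skew_kurtosis(x):
    """Sample skewness and excess kurtosis; both are 0 for a normal
    distribution, and the rule of thumb accepts values within -1 and +1."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()  # standardized scores
    return float(np.mean(z**3)), float(np.mean(z**4) - 3.0)

rng = np.random.default_rng(1)
# One roughly normal and one strongly skewed simulated variable
for name, scores in [("normal-ish", rng.normal(size=500)),
                     ("skewed", rng.exponential(size=500))]:
    s, k = skew_kurtosis(scores)
    ok = abs(s) <= 1 and abs(k) <= 1
    print(f"{name}: skew={s:.2f}, excess kurtosis={k:.2f}, within +/-1: {ok}")
```

Variables that fail the screen are candidates for the non-normality remedies on the next slides (ADF, ULS, bootstrapping, Bayesian estimation).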



Problems of non-normality (practical implications)  Inflated χ2 goodness-of-fit statistics.  Inappropriate modifications to theoretically adequate models.  Findings can be expected to fail to replicate, contributing to confusion in research areas.

41

How to detect non-normality in observed data?  Screen the data before the analysis to check the distributions.  Skewness and kurtosis: univariate normality.  AMOS provides normality results.

42



Solutions to non-normality  The asymptotically distribution free (ADF) estimator: ADF produces asymptotically unbiased estimates of the χ2 goodness-of-fit test, parameter estimates, and standard errors.  Limitation: requires a large sample size.

43



Solutions to non-normality  Unweighted least squares (ULS): no assumption of normality, and no significance tests available. Scale dependent.  Bootstrapping: does not rely on a normal distribution.  Bayesian estimation: for ordered-categorical data.

44
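The bootstrapping idea is simple to illustrate. A sketch with simulated data (AMOS does the analogous resampling internally when you request bootstrap estimates):

```python
import numpy as np

rng = np.random.default_rng(42)
# Simulated raw scores for two observed variables, n = 40 cases
data = rng.normal(size=(40, 2))
n = data.shape[0]

# Resample cases with replacement and recompute the covariance each time;
# the SD of the statistic across resamples is the bootstrap standard error,
# with no normality assumption required
boot_covs = []
for _ in range(1000):
    idx = rng.integers(0, n, size=n)
    boot_covs.append(np.cov(data[idx], rowvar=False)[0, 1])

print(f"bootstrap SE of the covariance = {np.std(boot_covs):.3f}")
```

The same resampling scheme works for any parameter a SEM program estimates, which is why bootstrapping is a general-purpose remedy for non-normal data.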



Sample size (rules of thumb)

 10 subjects per variable, or 20 subjects per variable
 250–500 subjects (Schumacker & Lomax, 2004)

45



Computer programs for SEM

 AMOS
 EQS
 LISREL
 MPLUS
 SAS

46







AMOS is short for Analysis of MOment Structures. It is software for the data-analysis technique known as structural equation modeling (SEM). It is a program for visual SEM.

47



Path diagrams  They are the way to communicate a SEM model.  They are pictures drawn to show the relationships among latent/observed variables.  In AMOS: rectangles represent observed variables and ellipses represent latent variables.

48



Example of using the AMOS tool bar to draw a diagram  Example  Two latent variables: intention and self-efficacy  Four observed variables: intention01, intention02, self_efficacy01, and self_efficacy02  Five error terms

49



The model should be like this

50



Go to All programs from Start > IBM SPSS Statistics > IBM SPSS AMOS19 > AMOS Graphics

51

[Figure: the AMOS Graphics window, with the tool bar on the left and the drawing area showing the latent and observed variables]

52

 Draw observed variables using the Rectangle tool
 Draw latent variables using the Ellipse tool
 Draw error terms using the error-term (unique variable) tool

53



Use Duplicate objects to create the other part of the model, then use Reflect

54

55



Open data: go to File > Data Files, click File Name, then select your file

56



Put observed variable names on the graph  Go to View > Variables in Dataset  Then drag each variable onto its rectangle

57



Put latent variables in the graph

 Put the mouse over one latent variable and right-click  Click Object Properties in the menu that appears  Type Self-efficacy in the Variable name field

58





For error terms, double-click the ellipse to open the Object Properties window. Constrain parameters: double-click the path from Self-efficacy to self_efficacy01, type 1 for the regression weight, then click Close.

59



The data are from the AMOS examples (IBM SPSS)  Attig repeated the study with the same 40 subjects after a training exercise intended to improve memory performance. There were thus three performance measures before training and three after training.

60



Draw diagram

61



Conduct the analysis: Analyze > Calculate Estimates  Text output 1. Number of distinct sample moments: sample means, variances, and covariances (AMOS ignores means here). With 4 observed variables, 4(4+1)/2 = 10. 2. Number of distinct parameters to be estimated: 4 variances and 6 covariances. 3. Degrees of freedom: the number of distinct sample moments minus the number of distinct parameters. 62



Text output

There is no null hypothesis being tested in this example, so the chi-square result is not very interesting.

63







For a hypothesis test, the chi-square value is a measure of the extent to which the data are incompatible with the hypothesis, and the degrees of freedom will be positive. A chi-square value of 0 indicates no departure from the null hypothesis. 64



Text output  Minimum was achieved: this line indicates that AMOS successfully estimated the variances and covariances. When AMOS fails, it is because you have posed a problem that has no solution, or no unique solution (a model identification problem). 65



Text output

1. Estimate is the estimated covariance: for example, the covariance between recall1 and recall2 is 2.556. 2. S.E. is an estimate of the standard error of the covariance, 1.16. 3. C.R. is the critical ratio, obtained by dividing the covariance estimate by its standard error. 4. For a significance level of 0.05, a critical ratio that exceeds 1.96 in absolute value is called significant. This ratio tests the null hypothesis that the covariance between recall1 and recall2 is 0. 66



Text output 5. In this example, 2.203 is greater than 1.96, so the covariance between recall1 and recall2 is significantly different from 0 at the 0.05 level. 6. The p value of 0.028 (two-tailed) tests the null hypothesis that the parameter value is 0 in the population.

67
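The C.R. and p-value arithmetic from the output can be reproduced directly. A sketch using the estimate (2.556) and standard error (1.16) quoted above:

```python
import math

def critical_ratio(estimate, se):
    """C.R. = parameter estimate divided by its standard error,
    compared against 1.96 for a two-tailed test at the .05 level."""
    return estimate / se

def two_tailed_p(cr):
    """Two-tailed p-value for the critical ratio under the standard
    normal distribution (Phi computed via the error function)."""
    phi = 0.5 * (1 + math.erf(abs(cr) / math.sqrt(2)))
    return 2 * (1 - phi)

cr = critical_ratio(2.556, 1.16)
print(f"C.R. = {cr:.3f}, p = {two_tailed_p(cr):.3f}")  # prints C.R. = 2.203, p = 0.028
```

The result matches the AMOS output discussed above: C.R. = 2.203 exceeds 1.96, and the two-tailed p-value rounds to 0.028.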

68