English Pages 110 [121] Year 2011
Structural Equation Modeling for Social and Personality Psychology
Rick H. Hoyle
The SAGE Library of Methods in Social and Personality Psychology
Series cover design by Wendy Scott
This volume offers a nontechnical presentation of structural equation modeling with an emphasis on applications in the fields of social and personality psychology.
The presentation begins with a discussion of the relationship between structural equation modeling and statistical strategies widely used in social and personality psychology, such as analysis of variance, multiple regression analysis, and factor analysis. This introduction is followed by a nontechnical presentation of the terminology, notation, and steps followed in a typical application of structural equation modeling. The remainder of the volume offers a practically oriented presentation of specific applications using examples typical of social and personality psychology, and advice for dealing with relevant issues such as missing data, choice of software, and best practices for interpreting and reporting results.
This text will be perfect for all advanced students and researchers in social and personality psychology using structural equation modeling as part of their studies or research. Rick H. Hoyle is Professor of Psychology and Neuroscience at Duke University.
The SAGE Library of Methods in Social and Personality Psychology is a new series of books to provide students and researchers in these fields with an understanding of the methods and techniques essential to conducting cutting-edge research. Each volume explains a specific topic and has been written by an active scholar (or scholars) with expertise in that particular methodological domain. Assuming no prior knowledge of the topic, the volumes are clear and accessible for all readers. In each volume, a topic is introduced, applications are discussed, and readers are led step by step through worked examples. In addition, advice about how to interpret and prepare results for publication is presented. The Library should be particularly valuable for advanced students and academics who want to know more about how to use research methods and who want experience-based advice from leading scholars in social and personality psychology.
Published titles:
Jim Blascovich, Eric J. Vanman, Wendy Berry Mendes, Sally Dickerson, Social Psychophysiology for Social and Personality Psychology
R. Michael Furr, Scale Construction and Psychometrics for Social and Personality Psychology
Rick H. Hoyle, Structural Equation Modeling for Social and Personality Psychology
John B. Nezlek, Multilevel Modeling for Social and Personality Psychology
Laurie A. Rudman, Implicit Measures for Social and Personality Psychology
Forthcoming titles:
John B. Nezlek, Diary Methods for Social and Personality Psychology
© Rick H. Hoyle 2011 First published 2011 Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act, 1988, this publication may be reproduced, stored or transmitted in any form, or by any means, only with the prior permission in writing of the publishers, or in the case of reprographic reproduction, in accordance with the terms of licences issued by the Copyright Licensing Agency. Enquiries concerning reproduction outside those terms should be sent to the publishers. SAGE Publications Ltd 1 Oliver’s Yard 55 City Road London EC1Y 1SP SAGE Publications Inc. 2455 Teller Road Thousand Oaks, California 91320 SAGE Publications India Pvt Ltd B 1/I 1 Mohan Cooperative Industrial Area Mathura Road New Delhi 110 044 SAGE Publications Asia-Pacific Pte Ltd 33 Pekin Street #02-01 Far East Square Singapore 048763 Library of Congress Control Number: 2010940292 British Library Cataloguing in Publication data A catalogue record for this book is available from the British Library ISBN 978-0-85702-403-9
Typeset by C&M Digitals (P) Ltd, Chennai, India Printed by MPG Books Group, Bodmin, Cornwall Printed on paper from sustainable resources
Cert no. SGS-COC-1565
Contents
1 Background
2 Model Specification
3 Estimation and Fit
4 Modification, Presentation, and Interpretation
5 Modeling Possibilities and Practicalities
References
Index
1 Background
This book is intended for graduate students and postgraduate researchers in social and personality psychology who wish to build on a foundation of graduate-level training in analysis of variance and multiple regression analysis and familiarity with factor analysis to learn the basics of structural equation modeling.1 The book offers a nontechnical overview at a level of depth designed to prepare readers to read and evaluate reports of research, and begin planning their own research, using structural equation modeling. This concise, application-oriented treatment is no substitute for coursework and fuller written treatments such as can be found in textbooks (e.g., Bollen, 1989; Kaplan, 2009; Kline, 2010; Schumacker & Lomax, 2004). Rather, it provides a bridge between briefer treatments offered in book chapters and didactic journal articles (e.g., Hoyle, 2007; Weston & Gore, 2006) and those fuller treatments.
Structural equation modeling (SEM) is a term used to describe a growing and increasingly general set of statistical methods for modeling data. In light of the statistical training typical of researchers in social and personality psychology, two features of SEM are worth noting at the outset. First, unlike familiar statistical methods such as analysis of variance (ANOVA) and multiple regression analysis, which estimate parameters (e.g., means, regression coefficients) from individual cases, SEM estimates parameters from variances and covariances. Indeed, it is possible to apply most forms of SEM using only the variances of a set of variables and the covariances between them. Second, although this feature is not required and, on occasion, is not used, a significant strength of SEM is the capacity to model relations between latent variables; that is, the unobserved constructs of which observed variables are fallible representations.
The focus on covariances rather than data from individual cases involves a move away from familiar estimators such as ordinary least squares toward more general estimators such as maximum likelihood. And the capacity to model relations between latent variables shifts the focus of data analysis from variables to constructs, thereby more closely aligning conceptual and statistical expressions of hypotheses. These departures from the statistical methods traditionally used by social and personality researchers require thinking differently about how data are brought to bear on research questions and hypotheses. The payoff is a comprehensive approach to modeling
data that is well suited for empirical tests of the richly detailed, process-oriented models of the human experience typical of social and personality psychology. In this opening chapter, I lay the groundwork for a nontechnical presentation of the nuts and bolts of SEM in the next three chapters and for promising applications in social and personality psychology in the final chapter. I begin by positioning SEM among the array of statistical methods of which most researchers in social and personality psychology would be aware. I then provide a sketch of the relatively short history of SEM. I next offer working definitions of key concepts with which many readers are not familiar. I conclude the chapter with a section on the use of path diagrams to convey a model to be estimated using SEM.
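The point made above, that parameters can be estimated from variances and covariances alone, can be illustrated with a familiar case. The following is a minimal sketch, not drawn from the book: the simulated variables `x1`, `x2`, and `y` and their generating coefficients are hypothetical. It shows that ordinary regression slopes computed from case-level scores match the slopes recovered from nothing more than the covariance matrix.

```python
import numpy as np

# Hypothetical simulated data; names and coefficients are illustrative only.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
y = 0.4 * x1 - 0.3 * x2 + rng.normal(size=n)

# Case-level route: least squares on the raw scores (with an intercept).
X = np.column_stack([np.ones(n), x1, x2])
b_cases = np.linalg.lstsq(X, y, rcond=None)[0][1:]  # drop the intercept

# Covariance route: use only the 3 x 3 covariance matrix of (x1, x2, y).
S = np.cov(np.column_stack([x1, x2, y]), rowvar=False)
S_xx = S[:2, :2]   # variances and covariance of the predictors
s_xy = S[:2, 2]    # covariances of each predictor with the outcome
b_cov = np.linalg.solve(S_xx, s_xy)

print(b_cases, b_cov)
```

The two sets of slopes agree to numerical precision, which is why a covariance matrix can stand in for the raw data in this simple case.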
SEM in Statistical Context
One way to make clear the comprehensiveness of SEM and the flexibility it affords for modeling data is to compare it to the various statistical methods historically used in social and personality research. Before making these comparisons, it bears noting that the full set of techniques captured by the term “structural equation modeling” is ever expanding, so that the term now invokes a substantially broader set of techniques than when it came into standard usage in the late 1970s. Indeed, the continuing expansion of SEM capabilities has made the boundaries between SEM and other statistical approaches somewhat hard to define. With that caveat in mind, I can describe SEM in relation to traditional and emerging statistical methods.
The names of statistical methods commonly used by researchers in social and personality psychology and selected newer methods are displayed in Figure 1.1. The methods are arrayed from specific to general moving from left to right. An arrow from one method to the next indicates that the former is a limited form of the latter. Put differently, methods to which arrows point could be used to address any statistical hypothesis that could be addressed by methods from which those arrows originate and any methods linked to them. For instance, referring to the top line in the figure, t-test is a limited form of ANOVA because it allows for a single factor with no more than two levels, whereas ANOVA accommodates multiple factors with two or more levels. ANOVA is the specific instance of ordinary least squares (OLS) regression in which all predictors are categorical. And so on. We first encounter a form of SEM, covariance structure modeling, about midway across the figure. As evidenced by the arrows pointing to it, this elemental form of SEM incorporates the essential capabilities of multiple regression analysis and factor analysis.
What is not apparent from the figure (and not essential information for the present discussion) is that covariance structure modeling is far more than simply a hybrid of these two well-known statistical strategies. Nevertheless, a useful starting point for researchers in social and personality
[Figure 1.1 Selected statistical methods in relation to structural equation modeling. Methods shown: t-test for means, bivariate correlation, crosstabulation, ANOVA, OLS regression, factor analysis, logistic regression, covariance structure modeling, random coefficient modeling, growth modeling, latent class analysis, factor mixture modeling, growth mixture modeling, and SEM: multilevel latent variable modeling with continuous and categorical observed and latent variables.]
psychology is the realization that, in simple terms, SEM can be conceptualized as an extension of multiple regression analysis in which multiple equations (often with multiple, directionally related outcomes) are simultaneously estimated and both predictors and outcomes can be modeled as the equivalent of factors as traditionally modeled in separate analyses using factor analysis. A limitation of this form of SEM is the focus solely on covariances between variables that are assumed to be measured on continuous scales. Examination of other arrows leading to and away from the covariance structure modeling entry in the figure makes clear how SEM is expanded by incorporating means modeling and allowing for categorical variables. The addition of means to variances and covariances in the data matrix allows for the modeling of patterns of means over time in latent growth models (described in Chapter 5) as well as other models analyzed using random coefficient modeling. Analyses made possible by the ability to estimate parameters from categorical data are traced from specific to general along the bottom of Figure 1.1. The transition from logistic regression (and other forms of the generalized linear model) to latent class analysis represents a shift from categories manifest in the observed data to those that are unobserved and inferred from data. Traditional factor analysis and latent class analysis meet in factor mixture modeling, which allows for both continuous and categorical latent variables. The addition of means gives rise to growth mixture modeling, in which heterogeneity in trajectories of means is modeled as latent classes. The array culminates on the right in the long-winded but apt label, multilevel latent variable modeling with continuous and categorical observed and latent variables, which reflects the current breadth of models that can be estimated within the SEM framework.
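The nesting described for the top line of Figure 1.1 (t-test within ANOVA within OLS regression) can be checked numerically. This is a minimal sketch under stated assumptions: the two groups `g0` and `g1` are hypothetical simulated samples, and the statistics are computed by hand with NumPy rather than with a dedicated statistics package.

```python
import numpy as np

# Two hypothetical groups of 30 cases each.
rng = np.random.default_rng(1)
g0 = rng.normal(loc=0.0, size=30)
g1 = rng.normal(loc=0.5, size=30)
n0, n1 = len(g0), len(g1)

# 1. Pooled-variance t-test for two means.
sp2 = ((n0 - 1) * g0.var(ddof=1) + (n1 - 1) * g1.var(ddof=1)) / (n0 + n1 - 2)
t_stat = (g1.mean() - g0.mean()) / np.sqrt(sp2 * (1 / n0 + 1 / n1))

# 2. One-way ANOVA with a single two-level factor.
y = np.concatenate([g0, g1])
grand = y.mean()
ss_between = n0 * (g0.mean() - grand) ** 2 + n1 * (g1.mean() - grand) ** 2
ss_within = ((g0 - g0.mean()) ** 2).sum() + ((g1 - g1.mean()) ** 2).sum()
f_stat = (ss_between / 1) / (ss_within / (n0 + n1 - 2))

# 3. OLS regression of the outcome on a 0/1 dummy for group membership.
d = np.concatenate([np.zeros(n0), np.ones(n1)])
X = np.column_stack([np.ones_like(d), d])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta
mse = resid @ resid / (len(y) - 2)
se_slope = np.sqrt(mse * np.linalg.inv(X.T @ X)[1, 1])
t_reg = beta[1] / se_slope

print(t_stat ** 2, f_stat, t_reg)
```

In this two-group case the squared t statistic equals the ANOVA F, and the t for the dummy-variable slope equals the t from the two-sample test, so each more general method reproduces the result of the more specific one.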
A somewhat arbitrary, primarily semantic, but nonetheless instructive distinction that can be made between the methods shown in Figure 1.1 is between those used to analyze data and those used to model data (Rodgers, 2010). Of course, ANOVA, the prototypic analysis method, can be accurately described as a strategy for modeling sources of variance in individual data; however, ANOVA is typically used in such a way that a relatively standard model is virtually always applied (main effects and interactions). Custom contrasts, trend analyses and the like reflect a move from simply analyzing data from individual cases to modeling means. Factor analysis and latent class analysis reflect the transition from analyzing to modeling data. They suffer, however, from the limitation that, as typically used, models are discovered rather than posited and tested. From covariance structure modeling forward, the methods are best described as approaches to modeling data rather than simply analyzing data. This distinction is elaborated further in the next chapter. A further distinction is between methods that typically are used to analyze or model individual cases and those typically used to model relations between variables. I noted at the outset that SEM differs from statistical methods commonly used by researchers in social and personality psychology in its focus on covariances rather than data from individual cases. To elaborate further, parameter estimation and statistical tests in familiar methods such as ANOVA and multiple regression analysis typically are based on the principle of minimizing the sum of the squared differences between observed scores for individual cases on the dependent variable and the case-level scores predicted by the independent variables. “Error” is defined as the average squared difference between observed and predicted scores across all cases. 
The goal of estimation in SEM is the minimization of the difference between the observed covariance matrix and the covariance matrix implied by the model. “Error” is defined as the difference between these two matrices as reflected in the value of an estimator-specific fitting function (covered in Chapter 3). In both cases the focus is on the degree to which a model either prescribed by the typical application of the method (as in ANOVA and multiple regression analysis) or specified by the researcher (as in SEM) reproduces the observed data. The distinction is in what constitutes the observed data: case-level scores in ANOVA and multiple regression analysis, and variances and covariances in SEM.
I close this section by noting a final pair of observations inspired by Figure 1.1. As established earlier, a statistical method can be used to accomplish any statistical hypothesis test that could be accomplished using methods prior to it in the figure. For instance, two means, which might typically be compared using the t-test, also could be compared using ANOVA, multiple regression analysis, and SEM. Although the use of SEM to compare two means could be defended on statistical grounds, practically speaking it would be unwise. The t-test is perfectly suited to this hypothesis test and requires little explanation or justification.
Following this principle of using the simplest and most straightforward statistical method for hypothesis tests, SEM becomes relevant when the hypothesis is a model that implies multiple equations (i.e., statements of the relations between independent and dependent variables) and/or makes use of a data set that includes multiple indicators of key constructs, allowing for the expression of constructs as latent variables. Although SEM is not always recommended for hypothesis testing in social and personality research, knowledge of the full array of modeling possibilities offered by SEM can inspire predictions and models that might not otherwise have been ventured. Research questions and explanatory models, on occasion, lead to the development of statistical methods for addressing them (e.g., models of intelligence and factor analysis). Typically, however, questions and models are shaped by the statistical methods of which the researcher is aware. As such, the more flexible and general the statistical approach, the broader the range of research questions and explanatory models likely to be ventured. The range of modeling possibilities afforded by SEM suggests new ways for social and personality psychologists to think about social and personality processes, pose research questions, design research, and analyze data.
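The contrast drawn earlier in this section, between error defined on case-level scores and error defined as the discrepancy between the observed and model-implied covariance matrices, can be made concrete. The sketch below is illustrative only. It assumes a one-factor model with three indicators and uses the commonly cited maximum likelihood discrepancy function, which this section does not state (fitting functions are covered in Chapter 3). The function names, loadings, and uniquenesses are hypothetical, and the "observed" matrix is constructed rather than computed from real data.

```python
import numpy as np

def implied_cov(loadings, uniquenesses, factor_var=1.0):
    """Covariance matrix implied by a one-factor model (illustrative helper)."""
    lam = np.asarray(loadings, dtype=float).reshape(-1, 1)
    return factor_var * (lam @ lam.T) + np.diag(uniquenesses)

def f_ml(S, Sigma):
    """ML discrepancy: ln|Sigma| - ln|S| + tr(S Sigma^-1) - p."""
    p = S.shape[0]
    _, logdet_sigma = np.linalg.slogdet(Sigma)
    _, logdet_s = np.linalg.slogdet(S)
    return logdet_sigma - logdet_s + np.trace(S @ np.linalg.inv(Sigma)) - p

# Hypothetical "observed" covariance matrix, built so that a one-factor
# model with these values reproduces it exactly.
S = implied_cov([0.8, 0.7, 0.6], [0.36, 0.51, 0.64])

# Two candidate sets of parameter values for the same model.
good = implied_cov([0.8, 0.7, 0.6], [0.36, 0.51, 0.64])
poor = implied_cov([0.3, 0.3, 0.3], [0.91, 0.91, 0.91])

f_good = f_ml(S, good)
f_poor = f_ml(S, poor)
print(f_good, f_poor)
```

The discrepancy is essentially zero when the implied matrix reproduces the observed one and positive otherwise. No case-level scores enter the calculation, which is the distinction the text draws between SEM and least squares methods.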
Historical Roots
Most historical accounts of the development of SEM trace its origins to the early 1920s and the development of path analysis by Sewall Wright, an evolutionary biologist. Wright invented the statistical method of path analysis, a graphical model in which the linear relations between variables are expressed in terms of coefficients that are derived from the correlations between them (Wright, 1934). The potential value of Wright’s model for social research was not immediately recognized; it was not until the 1960s that applications of path analysis to social research data were more fully developed. The principal figures in early applications of path analysis to data from social research were sociologists Blalock (1964) and Duncan (1966). Duncan and Goldberger, an econometrician, integrated the sociological approach to path analysis with the simultaneous equations approach in economics (e.g., Goldberger & Duncan, 1973) and the factor analytic approach in psychology (e.g., Duncan, 1975; Goldberger, 1971), yielding the basic form of SEM. This general model was formalized and extended in the 1970s by Jöreskog (1973), Keesling (1972), and Wiley (1973), producing what became known as the LISREL (Linear Structural RELations) model. This model includes two parts, one specifying the relations between indicators and latent variables – the measurement model – and the other specifying the relations between latent variables – the structural model. The LISREL model served as the basis for the LISREL software
program, which, by the release of Version 3 in the mid-1970s, allowed substantively oriented social and behavioral researchers to specify, estimate, and test latent variable models using SEM. The earliest uses of SEM in social and personality psychology appeared in the late 1970s and early 1980s. The earliest published applications in social psychology were by Peter Bentler and colleagues. For example, Bentler and Speckart (1979) used SEM to model the relation between attitude and behavior expressed as latent variables, including the first statistical modeling of the full set of relations in the influential theory of reasoned action (Fishbein & Ajzen, 1975). The earliest published uses of SEM in personality research are more difficult to pinpoint; however, by the mid-1980s applications of SEM, particularly the measurement model, to questions regarding personality structure began to appear (e.g., Reuman, Alwin, & Veroff, 1984; Tanaka & Bentler, 1983; Tanaka & Huba, 1984). In a prototypic application, Reuman et al. used SEM to model the achievement motive as a latent variable and show that, when random measurement error is separated from reliable common variance in fallible measures of the construct, validity coefficients are consistent with the theoretical model of the construct. By the late 1980s, spurred by compelling applications such as those by Bentler and Speckart (1979) and Reuman et al. (1984), and better access to software for specifying and estimating models, SEM found traction in social and personality psychology. Its use, and particularly the interpretation of its results, quickly gave rise to a literature on misuses of SEM and misinterpretation of SEM results by psychological scientists (e.g., Baumrind, 1983; Breckler, 1990; Cliff, 1983). 
The use of SEM in social and personality psychology has improved and increased despite the fact that formal training in SEM in social and personality psychology doctoral programs is still not the norm (Aiken, West, & Millsap, 2008). Extracurricular workshops and didactic volumes such as this one are the means by which many researchers in social and personality psychology learn about the capabilities of SEM and acquire basic proficiency in its use (often as implemented in a specific statistical software package). Although SEM is not likely to join ANOVA and multiple regression analysis as statistical methods that all social and personality researchers must know, its use will no doubt increase as compelling applications are published with increasing frequency.
The Language of SEM
Terminology
As with any statistical method (though perhaps more so), SEM is characterized by terminology that takes on precise and specific meaning when used with reference to it. Key terms are given full treatment at appropriate points later in the book. Basic definitions, which are offered in this section, will allow me to use the terms
selectively to provide an initial sketch of SEM in the remainder of this chapter and the first part of Chapter 2. Perhaps the most basic term in the SEM lexicon is model, a formal statement about the statistical relations between variables. Models typically are conveyed in diagrams as shown later in this chapter, or as equations as shown in Chapter 2. The origin and evaluation of models in the SEM context vary according to the modeling approach taken (Jöreskog, 1993). In the strictly confirmatory approach, the goal is to evaluate the degree to which a single, a priori model accounts for a set of observed relations between variables. For example, a researcher might evaluate the degree to which self-ratings on a set of adjectives selected to represent four types of affect conform to a model in which each adjective loads on only one of four correlated factors. Alternatively, instead of focusing on a single model, SEM might be used to compare two or more competing models in the alternative models approach. To continue the example, in addition to a model with four correlated factors, the researcher might consider a model with four uncorrelated factors, a model with a single factor, and/or a second-order model in which covariation between the four factors is explained by one superordinate factor. Finally, the use of SEM might involve model generating. If, for example, the researcher’s proposed four-factor model does not adequately explain self-ratings on the adjectives, and there are no obvious alternative models, rather than abandoning the data, the researcher might use them to generate a model. 
Of course, using the data to generate a model of the data is a questionable practice (MacCallum, Roznowski, & Necowitz, 1992); however, careful modification of an a priori model with the goal of finding a model that accounts for the data can lead to tentative discoveries that ultimately result in amended or revised theoretical models.
Specification involves designating the variables, relations between variables, and status of the parameters in a model. In terms of designating variables, the decisions are which variables in a data matrix to include as measured variables and which latent variables, if any, to model. In terms of designating the relations between variables, the researcher must decide which variables are related and, for those that are related, whether the relation is nondirectional or directional. Finally, the status of parameters in a model must be specified. Although, strictly speaking, specification is always involved in tests of statistical hypotheses, it is, in most cases, accomplished without the knowledge of the researcher in social or personality psychology. For example, the standard model estimated in applications of ANOVA – all main effects and interactions – typically is specified without consideration for other models that might be fit to the data. In typical applications of multiple regression analysis a specification decision is required in order to position one variable as the outcome and the others as predictors. Perhaps the closest that researchers in social and personality psychology come to specification is in decisions required for hierarchical multiple regression (e.g., how many sets; which variables in which sets; order in which sets enter?) and exploratory factor
analysis (e.g., number of factors to extract; method of rotation?). Because there is no standard model to be fitted using SEM, any application requires specification. Detailed coverage is provided in Chapter 2. A key aspect of specification is designating the status of parameters (e.g., variances, covariances, factor loadings, regression coefficients) in a model. Although specification can be quite specific regarding both the magnitude and sign of parameters, parameters typically are specified as either fixed or free. Fixed parameters are not estimated from the data and their value typically is fixed at zero or 1.0. Free parameters are estimated from the data. Because data analytic methods traditionally used in social and personality research do not focus on modeling, readers might not be aware of the fixed and free parameters in applications of those methods. In a hierarchical multiple regression analysis that includes three sets of variables, each comprising two variables, Step 1, which appears to include only two variables, could alternatively be viewed as a model that includes all six variables in which the regression weights for variables in the second and third sets have been fixed at zero. At Step 2, when variables in the second set are added, two of the four formerly fixed parameters (i.e., regression weights) are now free. In the alternative models approach described earlier, the differences between models to be compared typically involve parameters that are free in one model and fixed in the other. In the model generating approach, the adjustment of an initial model in an attempt to better account for the data often involves freeing parameters that were fixed and, to a lesser extent, fixing parameters that were free. The parameters of most interest in models are those associated with paths, which signify directional relations between two variables as in the effect of a predictor on an outcome in multiple regression analysis. 
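The reading of hierarchical regression given above, in which Step 1 is a six-variable model with four regression weights fixed at zero, can be sketched in code. This is a minimal illustration, not the author's procedure: the helper `fit_with_fixed`, the simulated predictors, and the generating weights are all hypothetical.

```python
import numpy as np

def fit_with_fixed(X, y, free):
    """Least squares slopes with non-free weights fixed at zero (illustrative)."""
    free = np.asarray(free, dtype=bool)
    b = np.zeros(X.shape[1])          # fixed parameters stay at 0
    Xc = X - X.mean(axis=0)           # centering absorbs the intercept
    yc = y - y.mean()
    b[free] = np.linalg.lstsq(Xc[:, free], yc, rcond=None)[0]
    return b

def rss(X, y, b):
    """Residual sum of squares for centered data and slopes b."""
    r = (y - y.mean()) - (X - X.mean(axis=0)) @ b
    return float(r @ r)

# Six hypothetical predictors in three sets of two.
rng = np.random.default_rng(3)
n = 300
X = rng.normal(size=(n, 6))
y = X @ np.array([0.5, 0.3, 0.2, 0.0, 0.1, 0.4]) + rng.normal(size=n)

step1 = fit_with_fixed(X, y, [1, 1, 0, 0, 0, 0])  # 2 free, 4 fixed at zero
step2 = fit_with_fixed(X, y, [1, 1, 1, 1, 0, 0])  # 4 free, 2 fixed at zero
step3 = fit_with_fixed(X, y, [1, 1, 1, 1, 1, 1])  # all 6 free

print(step1, step2, step3)
print(rss(X, y, step1), rss(X, y, step2), rss(X, y, step3))
```

Each step is the same six-parameter model with a different pattern of fixed and free weights, and freeing parameters can only leave the residual sum of squares unchanged or reduce it.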
The path coefficient indicates the direction and magnitude of the effect of one variable on another. Virtually all models include direct effects, which propose that one variable is temporally or causally antecedent to one other variable. These are types of effects routinely estimated in ANOVA or multiple regression analysis. Within a model, each direct effect characterizes the relation between an independent and a dependent variable, though the dependent variable in one direct effect can be the independent variable in another. Moreover, like multiple regression, a dependent variable can be related to multiple independent variables, and, like multivariate analysis of variance, an independent variable can be related to multiple dependent variables. The capacity to treat a single variable as both a dependent and independent variable lies at the heart of the indirect effect, the effect of an independent variable on a dependent variable through one or more intervening, or mediating, variables (Baron & Kenny, 1986). In the case of a single mediating variable, the mediating variable is a dependent variable with reference to the independent variable but an independent variable with reference to the dependent variable. Thus, the simplest indirect effect involves two direct effects. For instance, if x has a direct effect on z, and z has a direct effect on y, then x is said
to have an indirect effect on y through z. The sum of the direct and indirect effects of an independent variable on a dependent variable is termed the total effect of the independent variable. Effects in models involve one or both of two types of variables. Observed variables (sometimes referred to as manifest variables) are those for which there are values in the case-level data matrix. Virtually all analytic methods traditionally used by researchers in social and personality psychology estimate effects between observed variables. SEM also allows for the estimation of effects involving latent variables, which are implied by a model but are not represented by values in the case-level data matrix. Latent variables, or factors, are a function of observed variables, which, when used to model a latent variable, are referred to as indicators. Indicators are of two types, formative and reflective (Cohen, Cohen, Teresi, Marchi, & Velez, 1990). Formative indicators are presumed to cause their latent variable, which is modeled as a weighted, linear combination of them as in principal components analysis. Reflective indicators are presumed to be caused by their latent variable, which is modeled as an explanation of the commonality among them as in principal factors analysis. Although latent variables can, in principle, be a function of formative indicators, the overwhelming majority of latent variables are a function of the commonality among a set of reflective indicators as with common factors in exploratory factor analysis (Edwards & Bagozzi, 2000; more on this distinction in Chapter 5). Variables, whether observed or latent, can be further distinguished according to whether they are exogenous or endogenous. Exogenous variables (i.e., independent variables) are those for which no explanation is attempted within the model; that is, there are no paths directed toward them. 
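The direct, indirect, and total effects defined above can be illustrated with the x, z, y example. The sketch below makes two assumptions not stated in the text: the relations are linear and estimated by least squares, and the indirect effect is quantified as the product of the two direct effects that compose it, as is conventional in linear models. The data and the helper `slopes` are hypothetical.

```python
import numpy as np

# Hypothetical data in which x affects y directly and through z.
rng = np.random.default_rng(2)
n = 1000
x = rng.normal(size=n)
z = 0.5 * x + rng.normal(size=n)
y = 0.3 * x + 0.4 * z + rng.normal(size=n)

def slopes(predictors, outcome):
    """Least squares slopes (intercept dropped); illustrative helper."""
    X = np.column_stack([np.ones(len(outcome))] + list(predictors))
    return np.linalg.lstsq(X, outcome, rcond=None)[0][1:]

a = slopes([x], z)[0]              # direct effect of x on z
c_direct, b = slopes([x, z], y)    # direct effects of x and z on y
indirect = a * b                   # effect of x on y through z
total = c_direct + indirect        # sum of direct and indirect effects

c_total = slopes([x], y)[0]        # slope of y on x alone, for comparison
print(indirect, total, c_total)
```

With least squares estimates the sum of the direct and indirect effects reproduces the slope from regressing y on x alone, which matches the definition of the total effect given in the text.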
Endogenous variables (i.e., dependent variables) are those to which one or more paths are directed within the model. It is virtually always the case that some portion of the variance in endogenous variables is not explained by paths directed toward them. In such cases, unexplained variance is described in one of two ways depending on how the variable is positioned in the model. In the case of latent variables for which indicators are reflective, the unexplained variance in indicators is attributed to uniquenesses, variance unique to a given indicator in the sense that it is not shared with other indicators of the latent variable. In the case of latent or observed (nonindicator) endogenous variables, variance not accounted for by variables in the model that predict them is allocated to disturbances (equivalent to the error term in a regression equation). As will soon be apparent, uniquenesses and disturbances are latent variables that in some models can be specified as related to other variables in a model. A fully specified model, with its observed and latent variables and fixed and free parameters, implies a structure that is not directly evident in the unstructured set of p(p − 1)/2 covariances (where p is the number of observed variables; this value does not include variances). Although much can be learned from a thorough
examination of the covariances (particularly in standardized form as correlation coefficients), the degree to which they are consistent with theory-based models that offer accounts of the relations between the variables can rarely be determined from the data in their most elemental form. A specified model proposes a structure or pattern of statistical relations that is more useful, interesting, and parsimonious than the bivariate associations in the covariance matrix (hence the occasional reference to SEM as covariance structure analysis). As described below and detailed in Chapter 3, the question of model fit can be expressed as how well the covariance structure offered by the model maps onto the unstructured set of covariances. Although it would seem that, research design and logical considerations aside, any arrangement of variables and set of relations between them is possible with SEM, such is not the case. A key consideration when specifying a model is identification, which concerns whether a single, unique value for each and every free parameter can be obtained from the observed data. If for each free parameter a unique estimate can be obtained through one and only one manipulation of the observed data, then the model is just identified and has zero degrees of freedom. If a unique estimate for one or more free parameters can be obtained in multiple ways from the observed data, then the model is overidentified and has degrees of freedom equal to the number of observed variances and covariances minus the number of free parameters. If a single, unique estimate cannot be obtained from the observed data for one or more free parameters, then the model is underidentified and cannot be validly estimated. Thus, a restriction on specification is that the resultant model must be either just identified or overidentified. 
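The counting involved in these definitions can be sketched in a few lines of Python. This is a minimal illustration, not a test of identification: a nonnegative number of degrees of freedom is necessary but does not by itself guarantee that every parameter is identified. The function names and the one-factor models used as examples are hypothetical.

```python
def observed_moments(p: int) -> int:
    """Number of distinct variances and covariances among p observed variables."""
    return p * (p + 1) // 2


def degrees_of_freedom(p: int, free_parameters: int) -> int:
    """Observed variances and covariances minus the number of free parameters."""
    return observed_moments(p) - free_parameters


def counting_status(p: int, free_parameters: int) -> str:
    """Classify a model by counting alone (a necessary, not sufficient, check)."""
    df = degrees_of_freedom(p, free_parameters)
    if df < 0:
        return "underidentified"
    if df == 0:
        return "just identified"
    return "overidentified"


# Hypothetical one-factor models with one loading fixed at 1:
# free parameters = (p - 1) loadings + p uniquenesses + 1 factor variance = 2p.
for p in (2, 3, 4):
    print(p, degrees_of_freedom(p, 2 * p), counting_status(p, 2 * p))
```

With two indicators the count is negative, with three it is zero, and with four it is positive, which shows how the same kind of model moves across the three categories as observed variables are added.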
Although identification is rarely a concern in statistical models traditionally used by social and personality researchers, researchers occasionally stumble on it as a result of the inadvertent inclusion of a continuous variable as a factor in an ANOVA, resulting in a model requiring more degrees of freedom than the N − 1 that are available (i.e., it is underidentified). Identification is covered in greater detail in Chapter 2. A properly specified model can be estimated. Estimation is the statistical process of obtaining estimates of free parameters from observed data. Although single-stage least squares methods such as those used in standard ANOVA or multiple regression analysis can be used to derive parameter estimates, iterative methods such as maximum likelihood or generalized least squares are preferred. Iterative methods involve a series of attempts to obtain estimates of free parameters that imply a covariance matrix like the observed one. The implied covariance matrix is the covariance matrix that would result if values of fixed parameters and estimates of free parameters were substituted into the structural equations, which then were used to derive a covariance matrix. Iteration begins with a set of start values, tentative values of free parameters from which an implied covariance matrix can be computed and compared to the observed covariance matrix. After each iteration, the resultant implied covariance matrix is compared to the observed matrix. The comparison between the implied and observed covariance matrices
results in a residual matrix. The residual matrix contains elements whose values are the differences between corresponding values in the implied and observed matrices. Iteration continues until it is not possible to update the parameter estimates and produce an implied covariance matrix whose elements are any closer in magnitude and direction to the corresponding elements in the observed covariance matrix. Said differently, iteration continues until the values of the elements in the residual matrix cannot be minimized any further. At this point the estimation procedure is said to have converged. A properly converged solution produces the raw materials from which various statistics and indices of fit are constructed. A model is said to fit the observed data to the extent that the covariance matrix it implies is equivalent to the observed covariance matrix (i.e., elements of the residual matrix are near zero). The question of fit is, of course, a statistical one that must take into account features of the data, the model, and the estimation method. For instance, the observed covariance matrix is treated as a population covariance matrix, yet that matrix suffers from sampling error – increasingly so as sample size decreases. Also, the more free parameters in a model, the more likely the model is to fit the data because parameter estimates are derived from the data. Chapter 3 reviews several statistics and indices of fit, highlighting how each accounts for sampling error and lack of parsimony. As described at the beginning of this section, one way in which SEM can be applied is the alternative models approach, which involves comparing models that offer competing accounts of the data. Such models cannot always be formally compared. In some instances two or more alternatives are equivalent models; that is, they produce precisely the same implied covariance matrix and, as a result, identical fit to the data. 
Ideally, two or more models to be compared are not only not equivalent – they are nested. Two models are nested if they both contain the same parameters but the set of free parameters in one model is a subset of the free parameters in the other. Such models can be formally compared and, on statistical grounds, one chosen over the other. Readers familiar with hierarchical linear regression, in which predictors are entered in sets and statistical significance judged by the R² increment, already understand the general idea of nested models. One possible outcome of the strictly confirmatory and alternative models approaches to SEM is that the model(s) posited a priori do not provide an acceptable account of the data. In such cases, the researcher can either abandon the analysis or move to a model generating approach. This move entails model modification (or respecification), freeing parameters that in the a priori model(s) were fixed and/or fixing parameters that were free. Such decisions are made through specification searching, which may involve either a diagnosis by the researcher based on evaluation of output (e.g., the residual matrix) or an automated search based on statistical criteria implemented by the software. This exercise may produce a model that appears to offer an acceptable account of the data, but such models always await confirmation using new data.
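Two of the terms defined above, the implied covariance matrix and the residual matrix, can be made concrete with a small sketch. It assumes a hypothetical one-factor model with three reflective indicators. The observed covariances and the parameter values are invented for illustration, and the residuals are computed as observed minus implied, which is one of the two possible sign conventions.

```python
def implied_covariance(loadings, factor_variance, uniquenesses):
    """Covariance matrix implied by a one-factor model with reflective indicators."""
    p = len(loadings)
    sigma = [[loadings[i] * loadings[j] * factor_variance for j in range(p)]
             for i in range(p)]
    for i in range(p):
        sigma[i][i] += uniquenesses[i]  # unique variance adds to the diagonal only
    return sigma


def residual_matrix(observed, implied):
    """Element-wise difference between observed and implied covariances."""
    return [[o - m for o, m in zip(obs_row, imp_row)]
            for obs_row, imp_row in zip(observed, implied)]


# Hypothetical observed covariance matrix for three indicators.
observed = [[1.50, 0.85, 0.60],
            [0.85, 1.24, 0.40],
            [0.60, 0.40, 1.06]]

# Hypothetical values: first loading fixed at 1, the rest treated as estimates.
implied = implied_covariance([1.0, 0.8, 0.6], 1.0, [0.5, 0.6, 0.7])
residuals = residual_matrix(observed, implied)

for row in residuals:
    print([round(value, 3) for value in row])
```

In this example the variances are reproduced almost exactly, whereas two of the covariances are not. A pattern like this in the residual matrix is the kind of output a researcher might inspect during a specification search.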
Although not exhaustive, the list of terms defined here is sufficient to begin an exploration of SEM, with the goal of describing and illustrating applications well suited to research in social and personality psychology. This coverage of foundational information about SEM concludes with an overview of path diagrams.
Path Diagrams
The models typically tested using methods such as ANOVA and multiple regression analysis have become somewhat standardized. Moreover, the models are straightforward, involving a single dependent variable and a set of independent variables for which linear, and sometimes interactive, effects are estimated. As such, relatively little description or explanation of how these methods are being applied in a given study is required. As will become clear in Chapter 2, there is no standard application of SEM and, for a given set of variables, a potentially large number of models could be specified. This state of affairs makes it necessary for researchers to accurately and completely describe the model(s) to be fitted. An effective means of conveying the details of a model is the path diagram. In all likelihood the path diagram originated with Sewall Wright, who, as mentioned earlier, developed path analysis. The earliest instances appeared in a 1920 article on the relative contribution of heredity and environment to color variation in guinea pigs, which also introduced the terms path and path coefficient. The first instance, although it includes the directional arrows commonplace in path diagrams, also includes sketches of a guinea pig dam and sire as well as two offspring that vary in coloration. Moreover, the symbols for the genetic contributions to the color of offspring are a sperm and an egg! A second figure is both less entertaining and remarkably similar to path diagrams routinely used by sociologists in the 1960s and 1970s. As path analysis has been subsumed by SEM and SEM has expanded, the demands on path diagrams as a means of conveying the details of a model have increased. Indeed, some models are sufficiently complex that the path diagram is no longer an effective means of communicating the details of the model.
For most models that would be specified by researchers in social and personality psychology, however, the path diagram is an effective and efficient means of describing models to be estimated using SEM. A path diagram that includes most of the elements typical of path diagrams is shown in Figure 1.2. The general flow of the diagram is from left to right. In this instance, the model begins with F1, which is presumed to arise outside the model (i.e., it is exogenous), and culminates with F3, the construct the model is presumed to explain or, statistically speaking, account for variance in. When possible, the constructs are arrayed according to their presumed position in the model. In this instance, F2 is set between F1 and F3 because, as will soon be apparent, it is hypothesized to mediate the relation between F1 and F3.
[Path diagram not reproduced: latent variables F1, F2, and F3; indicators x1 to x9 with uniquenesses u1 to u9; disturbances d2 and d3; fixed loadings labeled 1 and free parameters labeled *.]
Figure 1.2 Example of a path diagram
With this general orientation in mind, let us now consider the components of the path diagram. The ovals and circles represent latent variables, sources of influence not measured directly. The ovals correspond to substantive latent variables, or factors. The oval labeled F1 is an independent variable – it is not influenced by other variables in the model. The ovals labeled F2 and F3 are dependent variables – their variance is, in part, accounted for by other variables in the model. Paths run from each of these latent variables to their indicators, represented by squares labeled x1 to x9. These paths are either labeled “1,” which means the factor loading has been fixed at this value (rationale provided in Chapter 2), or “*,” indicating that the factor loading is free and must be estimated from the data. Variance in each indicator not attributable to the latent variable is allocated to measurement error, or uniqueness, indicated by the small circles labeled u1 to u9. Associated with each of these circles is a curved, two-headed arrow and a *, which indicates a variance. The three latent variables are connected by directional arrows. Associated with each is a path coefficient, accompanied by a * indicating the
magnitude and direction of influence of one latent variable on another. Small circles also are associated with the endogenous latent variables. These indicate disturbances, variance in the latent variables, labeled d2 and d3, not accounted for by other latent variables in the model. Finally, there is a variance, indicated by *, associated with the latent independent variable. As is true of most models, this model includes a combination of free and fixed parameters. Free parameters are easily identified by the *. The location of fixed parameters is less obvious. It is apparent that there is a single fixed loading on each latent variable. The remaining fixed loadings involve paths that could have been included but were not. For instance, there is no path from F1 to x5. Implicitly, this path has been fixed to zero. Also, there are no covariances between uniquenesses, meaning these parameters are implicitly fixed at zero as well. Fixed parameters in the form of excluded paths are desirable in a model, for they contribute to parsimony and overidentification. They also can explain the inadequacy of a poor-fitting model. Hence, when processing path diagrams, it is important to take note of paths that have been omitted, indicating that the accompanying parameters have been fixed at zero. One additional feature of the model in Figure 1.2 bears mention. Notice that the directional effect of F1 on F3 takes two forms in the model. The path diagram indicates that F1 has a direct effect on F3 as indicated by the horizontal path along the top of the diagram. In addition, the model indicates that F1 has an indirect effect on F3 through F2. That is, F2 serves as an intervening variable, or mediator, through which the effect of F1 on F3 is transmitted. 
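The relation between the direct, indirect, and total effects of F1 on F3 can be sketched numerically. The path coefficients below are hypothetical, as is the function name. The indirect effect is computed as the product of the two constituent paths, which is the conventional path-analytic calculation for linear models of this kind.

```python
def effect_decomposition(a: float, b: float, c: float) -> dict:
    """Decompose the effect of F1 on F3 when F2 is the intervening variable.

    a: path from F1 to F2
    b: path from F2 to F3
    c: direct path from F1 to F3
    """
    indirect = a * b  # effect transmitted through F2
    return {"direct": c, "indirect": indirect, "total": c + indirect}


# Hypothetical path coefficients for the three directional paths in Figure 1.2.
effects = effect_decomposition(a=0.5, b=0.4, c=0.2)
for name, value in effects.items():
    print(f"{name}: {value:.2f}")
```

With these values half of the total effect is direct and half is transmitted through F2. Setting c to zero would represent a model in which the effect of F1 on F3 is entirely indirect.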
Modeling the variables of primary interest, F1, F2, and F3, as latent variables takes advantage of a key strength of SEM over traditional statistical approaches in social and personality psychology – the capacity to model out unreliability, thereby producing estimates of directional relations that have been corrected for attenuation (Muchinsky, 1996). The advantage of this capacity is aptly illustrated in the model depicted in Figure 1.2. The more unreliable the indicators of the intervening variable, F2, the greater the underestimation of the effect of F2 on F3 and overestimation of the effect of F1 on F2 (Hoyle & Kenny, 1999) if F2 is not modeled as a latent variable. In other words, one might fail to support the prediction that F2 mediates the F1–F3 relation when, in reality, it does (more on this in Chapter 5). Nonetheless, it bears noting that any F could be replaced with an x by creating composite variables from the indicators. Although one would gain a slight advantage over multiple regression analysis because the two equations could be estimated simultaneously, one would lose the important benefits of modeling relations between latent rather than observed variables. Equipped with a basic understanding of the origins of SEM, its relation to statistical models traditionally used by researchers in social and personality psychology, key terminology, and the path diagram, you are now in a position to learn enough about how SEM works and how it can be applied to contemplate using
it in your own work. In the remainder of the book, I offer a nontechnical treatment of key features of SEM, presented in the context of a framework for its application. I then review a representative set of models that could be fruitfully applied to the rich conceptual models typical of social and personality psychology.
Note
1 This book was written during a sabbatical leave generously provided by Duke University and funded in part by grant P30 DA023026 from the National Institute on Drug Abuse. I thank Erin Davisson, Cameron Hopkin, and Kaitlin Toner for providing feedback on a draft of the book. The Series Editor, John Nezlek, and Sage London Editor, Michael Carmichael, provided important input on format and style. As with all personal projects, I owe a debt of gratitude to my wife, Lydia, for covering for me so I could focus long enough to finish.
2
Model Specification
Although the focus of this chapter is specification of models in SEM, I begin by sketching an implementation framework for applying SEM. The framework sets the stage for material to be covered in this and the next two chapters while making clear how specification relates to and affects subsequent steps in the full implementation of SEM. After describing the implementation framework, I cover in some detail the process of model specification, including initial coverage of identification (which receives additional coverage in Chapter 3). Finally, aided by an example, I illustrate different methods for expressing the specification of a model.
Implementation Framework
Although specific implementations of SEM vary from one project to the next, common to all is a set of activities, which are reviewed here and elaborated beginning with this chapter and concluding with Chapter 4.
Specification
As established in Chapter 1, all implementations of SEM begin with the specification of a model. Generally speaking, specification involves designating all variables, observed and latent, that the model will comprise. The associations between these variables are then characterized by a set of relations that vary according to whether they are nondirectional, as in a correlation, or directional, as in a regression equation. Each relation implies a parameter, whose value is either fixed at a specific value or estimated from the data. Each parameter in the model must be identified, either by fixing its value or by virtue of a specification that ensures a single, unique value can be obtained through estimation. When the alternative models approach is used, two or more (ideally, nested) models are specified and compared. The bulk of this chapter is devoted to a discussion of how models are specified. Once specified, a model can be estimated.
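One way to see what a specification amounts to is to write it down as data. The sketch below records the parameters of the model shown in Figure 1.2 of Chapter 1 and tallies the free ones. The `Parameter` class is hypothetical, as are two details: the assignment of indicators to factors, which is read from the layout of the figure, and the choice of the first loading on each factor as the one fixed at 1. Parameters implicitly fixed at zero (omitted paths and covariances) are not listed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Parameter:
    kind: str                            # "loading", "path", or "variance"
    source: str
    target: str
    fixed_value: Optional[float] = None  # None means free (a * in the diagram)

    @property
    def is_free(self) -> bool:
        return self.fixed_value is None


# Assumed indicator assignment for the model in Figure 1.2.
INDICATORS = {"F1": ["x1", "x2", "x3", "x4"],
              "F2": ["x5", "x6"],
              "F3": ["x7", "x8", "x9"]}

params = []
for factor, indicators in INDICATORS.items():
    for k, x in enumerate(indicators):
        # Assumption: the first loading on each factor is the one fixed at 1.
        params.append(Parameter("loading", factor, x, 1.0 if k == 0 else None))
        u = f"u{x[1:]}"                  # uniqueness paired with this indicator
        params.append(Parameter("variance", u, u))

params += [
    Parameter("path", "F1", "F2"),
    Parameter("path", "F2", "F3"),
    Parameter("path", "F1", "F3"),
    Parameter("variance", "F1", "F1"),   # variance of the exogenous factor
    Parameter("variance", "d2", "d2"),   # disturbance of F2
    Parameter("variance", "d3", "d3"),   # disturbance of F3
]

n_observed = sum(len(xs) for xs in INDICATORS.values())
n_moments = n_observed * (n_observed + 1) // 2
n_free = sum(p.is_free for p in params)
print(n_free, n_moments - n_free)  # free parameters, degrees of freedom: 21 24
```

Under these assumptions the model has 21 free parameters set against 45 observed variances and covariances, leaving 24 degrees of freedom.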
Estimation
As touched on in Chapter 1, the goal of estimation is to obtain values of free parameters in a specified model. The means by which estimates are derived depends on which of a number of possible estimators is used. The default in most SEM computer software is maximum likelihood; however, a number of alternative estimators may be selected and may be more appropriate given characteristics of the data (e.g., scale of measurement, status of observed distributions) and model (e.g., categorical latent variables). With rare exception, values of free parameters are estimated iteratively, meaning that estimation begins with a set of start values (not to be confused with fixed values), which are updated until the difference between the observed covariance matrix and the covariance matrix implied by the model is minimized. The estimation process is further described and illustrated in the first half of Chapter 3. Estimation yields a number of values that are used to generate statistics and descriptive indices used to evaluate model fit.
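The iterative logic described above can be sketched in a few lines. The example is an illustration under stated assumptions, not a description of any SEM program. It fits a hypothetical one-factor model with three indicators, uses an unweighted least squares discrepancy rather than maximum likelihood, and updates the estimates by gradient descent with an adaptive step, which is simpler than the optimizers typically found in SEM software. It also does nothing to keep variance estimates positive. All names and numbers are invented.

```python
def implied(params):
    """Implied covariance matrix for a one-factor, three-indicator model.

    params = [l2, l3, phi, t1, t2, t3]; the first loading is fixed at 1.
    """
    l2, l3, phi, t1, t2, t3 = params
    loadings = [1.0, l2, l3]
    thetas = [t1, t2, t3]
    return [[loadings[i] * loadings[j] * phi + (thetas[i] if i == j else 0.0)
             for j in range(3)] for i in range(3)]


def discrepancy(params, observed):
    """Sum of squared residuals over the distinct variances and covariances."""
    sigma = implied(params)
    return sum((observed[i][j] - sigma[i][j]) ** 2
               for i in range(3) for j in range(i, 3))


def gradient(params, observed, h=1e-6):
    """Central-difference numerical gradient of the discrepancy."""
    grad = []
    for k in range(len(params)):
        up, down = list(params), list(params)
        up[k] += h
        down[k] -= h
        grad.append((discrepancy(up, observed) - discrepancy(down, observed)) / (2 * h))
    return grad


def estimate(observed, start, max_iter=5000, tol=1e-12):
    """Update start values until the discrepancy can no longer be reduced."""
    params, step = list(start), 0.1
    current = discrepancy(params, observed)
    history = [current]
    for _ in range(max_iter):
        grad = gradient(params, observed)
        trial = [p - step * g for p, g in zip(params, grad)]
        new = discrepancy(trial, observed)
        if new < current:      # accept the update only if fit improves
            params, current = trial, new
            history.append(current)
            step *= 1.2
        else:                  # otherwise try a smaller step
            step *= 0.5
        if current < tol or step < tol:
            break              # treated here as convergence
    return params, history


# Hypothetical observed covariance matrix and start values.
observed = [[1.50, 0.80, 0.60],
            [0.80, 1.24, 0.48],
            [0.60, 0.48, 1.06]]
start = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
estimates, history = estimate(observed, start)
print(len(history) - 1, "accepted updates; discrepancy", history[0], "->", history[-1])
```

The list `history` records the discrepancy after each accepted update, so it shows the implied matrix moving closer to the observed matrix from one iteration to the next.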
Evaluation of Fit
Unlike statistical methods traditionally used by social and personality researchers, the focus of SEM is goodness of fit; that is, how well does the model, as reflected in the implied covariance matrix, account for the observed data, as reflected in the observed covariance matrix? Consistent with the null hypothesis statistical testing typical of traditional methods, the first fit statistic, in theory distributed as a χ², purported to test whether the two matrices differed within sampling error. This statistic rather quickly fell into disfavor because (1) it assumes characteristics of the data and model that rarely are met in practice, and (2) it tests a hypothesis of limited interest – that the model perfectly fits the data. The pioneering work of Bentler and Bonett (1980) led to the development of a large number of adjunct fit indices, a subset of which is reviewed in the second half of Chapter 3. Such evaluations of fit often are accompanied by model comparisons, which may provide the basis for choosing one model over another even if both models in an absolute sense provide acceptable accounts of the data. These tests of omnibus fit are followed by tests of individual parameters, which may take the form of absolute tests (typically against zero) or tests in which two parameters within the model or the same parameter for groups whose data have been simultaneously fitted to the same model are compared.
Modification
If, by the criteria of choice, a model does not provide an acceptable account of the data, the researcher is faced with two choices. If he/she is taking the strictly confirmatory approach to model evaluation, then the implementation of SEM has
concluded with a null finding. If, as is now commonplace, the researcher either began with the express intent of model generation or is now willing to consider the model generating approach, implementation moves forward with modification. Model modification, or respecification, involves using the results from estimating an a priori model or set of alternative models to adjust the model in ways that are likely to improve its fit with the goal of reaching some threshold of acceptable fit. Skilled users of SEM are capable of analyzing residuals from comparing the observed and implied covariance matrices or the pattern of significant and nonsignificant parameter estimates to infer adjustments to a model likely to improve its fit. For example, perhaps much of the covariance between two of four indicators of a latent variable is not explained by the model, suggesting that freeing the covariance between their uniquenesses (typically fixed a priori at zero) would eliminate this portion of the discrepancy between the observed and implied covariance matrices. Less skilled users, or skilled users working with highly complex models, often turn to software-driven specification searches that consider all the ways in which a model could be modified and the improvement in fit that would result from each modification. A presentation of modification strategies and discussion of the potential pitfalls of model modification are provided in the first half of Chapter 4. It is not always possible to find a plausible model that meets established thresholds of fit. In such cases, the implementation concludes with a null finding. Often, however, modification yields an interesting model that offers a satisfying, if tentative, account of the process under study. In such cases, the implementation moves to the final step, interpretation.
Interpretation
If, by the criteria of choice, a model offers an acceptable account of a set of observed data, or if modifications to an a priori model have resulted in a model that merits dissemination, then the implementation of SEM moves to interpretation. By interpretation, I mean presentation and discussion of the findings in a manner that is accurate, true to the implementation, and sufficiently detailed that the motivated reader could precisely replicate the findings given the raw data. Many of the early criticisms of the use of SEM in psychological science focused on concerns about interpretation (e.g., Baumrind, 1983; Breckler, 1990; Cliff, 1983; Freedman, 1987). These ranged from ill-advised inferences of causality to unqualified acceptance of specific models when equivalent and contradictory models could not be ruled out. To these, I would add more mundane concerns about how results are presented and the lack of sufficient detail in the description of how SEM was implemented. Additional details about interpretation, including information about presenting the results of an implementation of SEM, are provided in the second half of Chapter 4.
The Process of Specification
Having provided this general sense of how specification fits into the overall framework for implementing SEM, I now turn to a more detailed discussion of the process of specifying a model, which involves selecting the variables to be included in a model, designating the relations between those variables, and setting the status of the parameters in a model. It should be noted that these steps in the process of specification can be undertaken either at the stage of planning a study, in which case the model dictates which data are collected and how, or after a study has been completed, in which case the range of models that might be specified is constrained by the availability of observed variables in the data set. Allowing model specification to guide data collection ensures that no compromises or dubious assumptions have to be made at the specification, estimation, or interpretation stages of an implementation of SEM.
Selecting the Variables
A key consideration when selecting the variables to be included in a model is whether latent variables will be specified. If no latent variables are specified, then all relations are between observed variables for which values are, or will be, in columns in the case-level data matrix. Moreover, in such models, all variables are either independent (i.e., exogenous) or dependent (i.e., endogenous) variables; that is, no observed variables serve as indicators. If latent variables are specified, then observed variables in the model are of three types: independent variables, dependent variables, and indicators. Indicators are not directly involved in directional relations; their influence on other variables is through the latent variable of which they are a fallible representation. It is possible, as in the model presented in path diagram form in Chapter 1, that all observed variables in a model are indicators. In such cases, if there are directional relations in the model, the independent and dependent variables are latent variables. On occasion, the indicators of some of the latent variables in a model are produced through manipulations of the observed variables. Imagine, for example, that one of the independent variables in a model is global attitude toward women, which was (or will be) assessed using a self-report measure that includes 12 items. Although one might be tempted to specify an attitude-toward-women latent variable with 12 indicators, for reasons that will become apparent in the next chapter, the likelihood of that portion of the model fitting well is slim. An alternative approach is to create parcels, unit-weighted composite variables comprising subsets of the indicators. For instance, the 12 items could be assigned in rotation to four parcels (items 1, 5, and 9 to the first parcel; items 2, 6, and 10 to the second; and so on), and the three items in each parcel summed to produce four indicators. Parceling results in fewer, more reliable indicators of latent variables.
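The construction of parcels can be sketched as follows. The sketch assumes one simple scheme, assigning items to parcels in rotation so that a 12-item scale yields four parcels of three items each. The function name and the responses are hypothetical, and other assignment schemes are possible.

```python
def make_parcels(item_scores, n_parcels):
    """Sum items into unit-weighted parcels, assigning items in rotation."""
    parcels = [0] * n_parcels
    for position, score in enumerate(item_scores):
        parcels[position % n_parcels] += score  # unit weight: a plain sum
    return parcels


# One hypothetical respondent's answers to a 12-item scale (1 to 5 format).
responses = [4, 3, 5, 2, 4, 4, 3, 5, 1, 2, 4, 3]
print(make_parcels(responses, 4))  # [9, 9, 12, 10]
```

Applied to every row of the case-level data matrix, a function like this would replace 12 item columns with four parcel columns, which then serve as the indicators of the latent variable.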
In large models focusing primarily on the directional relations between latent variables, parceling is a way to reduce the number of free
parameters while gaining some of the advantages of modeling constructs as latent variables. In addition to simplifying models by reducing the number of indicators, parceling can be useful as a means of producing suitable indicators when they are not available in the data as observed. For example, as suggested in the book on implicit measurement in the collection of which this book is part (Rudman, 2011, Chapter 5), response latency measures may include many trials, yielding far too many potential indicators of a latent variable. Parceling allows the researcher to reduce the number of indicators to a manageable number. Parceling also is a strategy for moving from dichotomous variables (e.g., yes/no, present/absent) and the estimation challenges they pose, to polytomous variables that are better suited for standard estimators such as maximum likelihood. The discussion of parceling raises the question: How many indicators of a latent variable are optimal when the data set and model allow for discretion? For reasons detailed below, two indicators are sufficient to satisfy estimation rules only under certain conditions. As such, the specification of latent variables with two indicators is to be avoided when possible. For a “free-standing” latent variable (i.e., one that is not related to other variables in a model), three indicators are sufficient for estimation but not model testing. In other words, it is not possible to reject a latent variable with three indicators on the basis of model fit. For these reasons, four is the recommended minimum number of indicators per latent variable in a model. In theory, there is no maximum number of indicators that could be specified per latent variable, but, as suggested earlier, each added indicator increases the likelihood of inadequate fit. Why is this the case? Recall that, in the case of reflective indicators, the latent variable reflects variance common to all of its indicators. 
As the number of indicators increases, the likelihood increases that two or more of them share variance that they do not share with the remaining indicators. So, does this mean fewer is always to be preferred when the number of indicators per latent variable is flexible? Although parsimony and model fit are served by fewer indicators, surprisingly, increasing the number of indicators per latent variable increases the likelihood of successful estimation when sample size is small (Marsh, Hau, Balla, & Grayson, 1998). These considerations apply only when the primary focus of a model is the relations between latent variables, not when the focus is the quality of a set of indicators of a latent variable as might be the case in psychometric evaluations of a multi-item scale. In such cases, each item, regardless of the number, should be specified as an indicator of the latent variable it is purported to measure. Because of the limitations of traditional methods of data analysis such as ANOVA and multiple regression analysis, researchers in social and personality psychology are accustomed to combining indicators into observed composite scores on which analyses are based. Because, as established in Chapter 1, these methods are special cases of SEM, the same could be done in implementations of SEM. As such, given the potential complications that arise from modeling
constructs as latent variables, what are the benefits compared to observed composite variables? The primary benefit of estimating relations between latent as opposed to observed variables is that coefficients indexing the strength of association are corrected for attenuation (i.e., underestimation) attributable to unreliability. Such coefficients are assumed to be closer to their population values and, because they are likely to be larger, are more likely to attain statistical significance. Furthermore, because paths in a model are interdependent, larger coefficients in one part of the model may result in smaller coefficients in another part of the model as in the case of mediation analysis (more on this dynamic in Chapter 5). Although this is a significant benefit of modeling constructs as latent variables, researchers should be careful not to overstate the extent of the benefit. It is fashionable to state that latent variables are corrected for measurement error; however, they are only corrected for sources of error that vary across their indicators (DeShon, 1998). So, for example, when all indicators of a latent variable are measured by self-report, any bias attributable to self-reporting is shared by all the indicators and therefore reflected in the latent variable, not removed from it. In some cases, the specification of latent variables provides a means of reducing the number of predictors or (more typically) outcomes (Cole, Maxwell, Arvey, & Salas, 1993). In the most straightforward case, the researcher has access to multiple measures of key constructs (e.g., Beck Depression Inventory and CES-D) but no compelling reason to model each as a separate latent variable. A single latent variable (e.g., dysphoria) is likely to be more reliable than latent variables of each measure and moves the focus away from particular measures of the construct to the construct itself. 
Alternatively, the researcher might have access to measures of different constructs that reflect an abstract, superordinate construct such as mental health or positive emotionality. The focus of the model might be the superordinate construct, or the researcher might wish to estimate relations with the superordinate construct before or in addition to relations with the different facets for which measures were administered.

When data to be analyzed using SEM were not collected with SEM in mind, it is not uncommon for one or more key constructs to be represented in the data set by a single observed variable. In such cases, it is not possible to model a latent variable that captures the commonality in a set of indicators. Nonetheless, a latent variable can be specified by using an estimate of the reliability of the single variable in order to fix its uniqueness when modeled as an indicator of a latent variable. The logic of this specification is illustrated in Figure 2.1. Because the variable is represented in the data by a single element in the covariance matrix, its variance, the specification can include only one free parameter. To the left is a path diagram showing vx as the lone indicator of F1. Because the uniqueness, ex, is fixed to zero and the loading fixed to 1.0, the latent and observed variables are one and the same. The variance of the latent variable is the variance of the observed variable in the data. To the right is a path diagram that again shows
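[Figure 2.1 Single indicator latent variables specified without measurement error (left) and with fixed measurement error (right). In both panels vx is the lone indicator of F1, with the loading fixed at 1 and the variance of F1 free (*); ex is fixed at 0 in the left panel and at .3 in the right panel.]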
the loading fixed to one, but now shows ex fixed to a nonzero value. This value is computed from the reliability of the variable and its variance in the observed data using the formula $e_x = (1 - r_{xx})\,s_x^2$. The particular value in the figure derives from a hypothetical example in which the reliability of the indicator, $r_{xx}$, is .80 and its variance, $s_x^2$, is 1.5. The variance in F1 is now reduced to an estimate that would be expected if F1 were modeled as a latent variable using a set of indicators with a reliability of .80.

Relevant to the discussion of selecting observed and latent variables is a distinction that often is made between components of a model. The measurement component, sometimes referred to as the measurement model, comprises those aspects of the model that concern the relations between indicators and latent variables. A model need not include a measurement component as in the case in which all variables are observed (e.g., path analysis). At the other extreme, a model may include only the measurement component as in models that are primarily concerned with the relations between indicators and latent variables (e.g., confirmatory factor analysis). The structural component, or structural model, comprises those aspects of the model that concern the relation between independent and dependent variables, observed or latent. In models that include both measurement and structural components, it is not uncommon for the analysis to begin with a focus strictly on the measurement component. Only after the adequacy of the measurement component is established is the structural component added and paths between latent variables estimated (Anderson & Gerbing, 1988).
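Returning to the single-indicator specification in Figure 2.1, the arithmetic behind the fixed uniqueness can be sketched in a few lines. The function and variable names below are illustrative assumptions; the reliability (.80) and variance (1.5) are the hypothetical values given in the text. The last step assumes the loading is fixed at 1, so that the variance of the indicator is the sum of the latent variance and the uniqueness.

```python
def fixed_uniqueness(reliability, observed_variance):
    """Error variance to fix for a single indicator: (1 - r_xx) * s_x^2."""
    return (1.0 - reliability) * observed_variance


r_xx = 0.80   # reliability of the indicator (value from the text)
s2_x = 1.5    # observed variance of the indicator (value from the text)

e_x = fixed_uniqueness(r_xx, s2_x)

# With the loading fixed at 1, var(vx) = var(F1) + e_x,
# so the latent variance is what remains after removing e_x.
latent_variance = s2_x - e_x

print(round(e_x, 2), round(latent_variance, 2))  # 0.3 1.2
```

The value .3 matches the fixed uniqueness shown in the right panel of the figure.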
Designating Relations between Variables

When the variables to be included in a model have been selected, the next step in specification is to designate the relations between them. The types of relations are quite straightforward – nondirectional and directional – but the justification for them, particularly directional relations, is not so straightforward, particularly when, as in many survey studies, no variables are manipulated and the observed data are gathered at one point in time. Moreover, certain relations that might otherwise be overlooked because they are not of substantive interest must be specified in order to account for features of the research design.

The most rudimentary relation between two variables is the nondirectional relation, which is estimated and tested in SEM as a covariance. Because nondirectional relations imply no causal or temporal precedence of one variable over the other, they generally require little justification. Yet, despite their relative simplicity, the specification of nondirectional relations is not always straightforward. For instance, certain nondirectional relations that are implicit in applications to which social and personality researchers are accustomed must be explicitly specified in SEM. When a model includes more than one exogenous variable, the covariances between these variables should be estimated. In multiple regression analysis, in which predictors are observed exogenous variables, those covariances are specified implicitly. Similarly, when a latent variable is modeled with the same indicators on multiple occasions, the uniquenesses associated with instances of the same indicator should be specified as covarying. This specification reflects the fact that any specificity in an indicator at one point in time is likely present at other points in time (example provided in Chapter 5).
Importantly, omission of covariances in these two situations, which is the equivalent of fixing them at zero, would virtually ensure a poor fit to the data and perhaps raise doubts about the relations of interest in the model when the culprit is failure to accurately reflect the design of the research. Nondirectional relations should be specified when dictated by the research design and when the direction of the relation between two variables is either not of interest or impossible to ascertain given the design of the study. The nature of directional relations between variables in SEM has been at the heart of criticism and debate since published reports of SEM began appearing in social and personality psychology journals (e.g., Baumrind, 1983). The criticism stems largely from misconceptions by early users and readers, perhaps swayed by the causal language favored by sociologists at the time (e.g., Blalock, 1961), that SEM could be used to test for causality (hence, directionality) in ways that other statistical methods could not. The vivid portrayal of directional relations in path diagrams also suggests a temporal or causal ordering that may or may not be reflected in the researcher’s interpretation of the model. In this regard, it is instructive to compare path diagrams for ANOVA, multiple regression analysis, and SEM. On the left side of Figure 2.2 is a path diagram representation of a two-way ANOVA. The two factors, a and b, are observed and might result from manipulations,
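[Figure 2.2 Path diagrams of a two-way ANOVA (left) and a multiple regression analysis (right). Left panel: observed variables a, b, and ab with directional paths to y and its error term e. Right panel: correlated predictors x1, x2, and x3 with directional paths to y and its error term e. Free parameters are marked with *.]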
pre-existing categories (e.g., gender), or categorization on individual difference variables. It is not ordinarily apparent in an ANOVA, but their interaction, ab, also is an observed variable. Note that, in this example, the factors are presumed to be orthogonal because there are no covariances between the factors. On the right side of Figure 2.2 is a path diagram representation of a multiple regression analysis. The predictors, x1, x2, and x3, may be nominal, ordered categorical, or continuous. They are assumed to be correlated, as reflected in the nondirectional paths accompanied by free parameters between each one. In both models, the * on the directional paths capture the effect of the independent variables on the dependent variable; tests of those parameters against zero would be the equivalent of tests of the effects in ANOVA or multiple regression analysis, respectively. The striking similarity between the two path diagrams makes clear the relation between ANOVA and multiple regression analysis. The diagrams also make clear how either statistical model could be cast in SEM terms. In addition to further highlighting the relation between ANOVA and multiple regression analysis and both of these methods and SEM, the path diagrams illustrate that the estimation of directional paths in SEM is fundamentally the same as testing effects in these less controversial models. In the same way that the direction of the effect of a on y in the ANOVA model often is in doubt when a is an individual difference, or the direction of the effect of x3 on y is typically uncertain when both are assessed in a single survey administration, the status of directional relations in SEM often is uncertain. 
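The point that the interaction in a two-way ANOVA is itself an observed variable, and that the factors are orthogonal, can be illustrated with a small sketch. It assumes a balanced 2 × 2 design with effect coding (−1, +1) and five cases per cell; the coding scheme, cell size, and helper function are assumptions made for illustration, not details from the text.

```python
from itertools import product


def covariance(u, v):
    """Sample covariance of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v)) / (n - 1)


# Balanced 2 x 2 design: every combination of codes appears 5 times.
cells = [cell for cell in product((-1, 1), repeat=2) for _ in range(5)]
a = [cell[0] for cell in cells]
b = [cell[1] for cell in cells]

# The interaction is an observed variable: the product of the two codes.
ab = [ai * bi for ai, bi in zip(a, b)]

# In a balanced, effect-coded design all three covariances are zero,
# which is why no nondirectional paths appear among a, b, and ab.
print(covariance(a, b), covariance(a, ab), covariance(b, ab))
```

With unequal cell sizes or a different coding scheme, these covariances generally would not be zero and would need to be specified.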
Yet, in the same way that few researchers would abandon ANOVA and multiple regression analysis for bivariate correlation analysis because the direction of relations between independent and dependent variables is uncertain, researchers should not forgo the specification of models with directional relations when they correspond to effects of theoretical interest. As with ANOVA or multiple regression analysis, the potential for misuse arises at the point of interpreting significant effects. Implying that a causes y or x3 causes
y based on significant results would be incorrect unless the research design supports such an inference. When such support is not available, regardless of whether the effects were estimated using ANOVA, multiple regression analysis, or SEM, the correct inference is that the findings are consistent with a directional effect but do not provide a definitive test of it. It should be noted that, in cross-sectional data, the questions of whether a relation between two variables is directional and which way the direction runs cannot be settled through model testing or model comparison using SEM. Indeed, in many cases, the fit of a model with a directional path between two variables will be no different than the fit of a model with a nondirectional path between them. Such equivalent models pose an inferential conundrum, which I address more fully in Chapter 3. In short, SEM, like other statistical methods used in social and personality research, cannot be used alone to support the conclusion that the relation between two variables is directional.
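The claim that a directional and a nondirectional specification can fit equally well can be seen in the simplest possible case. The sketch below uses two observed variables with hypothetical variances and covariance (all values and names are assumptions for illustration). It shows that a model with a path from x to y and a model with a path from y to x both reproduce the observed moments exactly, just as a model that simply estimates the covariance would.

```python
import math

# Hypothetical observed variances and covariance for two variables.
var_x, var_y, cov_xy = 2.0, 3.0, 1.2


def implied_moments(var_pred, var_out, cov):
    """Moments implied by a just identified model: predictor -> outcome."""
    slope = cov / var_pred                          # directional path
    disturbance = var_out - slope ** 2 * var_pred   # residual variance
    return {
        "var_out": slope ** 2 * var_pred + disturbance,
        "cov": slope * var_pred,
    }


x_to_y = implied_moments(var_x, var_y, cov_xy)  # x -> y
y_to_x = implied_moments(var_y, var_x, cov_xy)  # y -> x


def same(u, v):
    return math.isclose(u, v, rel_tol=1e-9)


# Both directions reproduce the observed variances and covariance,
# so fit alone cannot distinguish between them.
both_reproduce_data = (
    same(x_to_y["var_out"], var_y) and same(x_to_y["cov"], cov_xy)
    and same(y_to_x["var_out"], var_x) and same(y_to_x["cov"], cov_xy)
)
print(both_reproduce_data)
```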
Setting the Status of Parameters

After variables have been selected and the relations between them designated, the specification of a model is completed by setting the status of parameters. As noted earlier, parameters in a model can be either fixed or free. Free parameters are estimated from the data. The total number of free parameters in a model cannot exceed p(p + 1)/2, where p is the number of observed variables (this value includes covariances and variances). For reasons of identification (discussed below), certain combinations of free parameters are not permissible. Free parameters in a model generally fall into two categories: those of substantive interest to the researcher, such as factor loadings or structural path coefficients; and those of little or no substantive interest but necessary to account for the characteristics of the research design (e.g., autocorrelated uniquenesses in longitudinal models with latent variables) or that are simply a consequence of the model specification (e.g., uniquenesses, disturbances). The former are familiar to social and personality researchers, but the latter typically are estimated as a matter of course and in such a way that researchers are not aware they have been estimated.

Researchers in social and personality psychology rarely have occasion to fix parameters. In addition to hierarchical multiple regression, discussed in Chapter 1, an example is orthogonal rotation in exploratory factor analysis, which, in effect, involves fixing the covariances between factors to zero. In the specification of models in SEM, setting a subset of the parameters in a model to specific values before estimation is always necessary. In virtually every case, the value to which fixed parameters are set is zero; in one instance, certain parameters in measurement models are set to one. There are three reasons a given parameter may be fixed: (1) it is necessary for the parameter, and therefore the model, to be identified; (2) it is
the convention for models of the type being specified; and (3) it allows for the test of a substantive hypothesis.

In order for the free parameters in a model to be estimated, tested, and interpreted, every parameter in the model must be identified. That is, there must be one, and only one, value for each parameter given the observed data, model specification, and estimator. Identification is straightforward for fixed parameters, the values of which have been identified by the researcher based on convention, characteristics of the research design, or, in rare cases, prior knowledge regarding the parameter. One convention that will be evident in any model that includes latent variables is the fixing of a single parameter for each latent variable. The purpose of this fixed parameter is the identification of the variance of the latent variable, which otherwise is not identified. The most straightforward means of identifying the variances of latent variables is fixing them directly (typically to a value of 1.0). Although this strategy works well for exogenous latent variables, it is not easily accomplished for endogenous latent variables, whose variances are a function of paths leading to them and their disturbances (e.g., F2 and F3 in Figure 1.2). For endogenous latent variables (and typically for exogenous latent variables as well), variances are identified by fixing one loading to 1.0, which, in effect, fixes the variance of the latent variable to the common portion of the variance of that indicator (Steiger, 2002). Other conventions and design-based reasons for fixing parameters are discussed in Chapter 5.

Establishing the identification of free parameters is more challenging. To do so definitively requires demonstrating that a unique value for each free parameter can be obtained through one or more algebraic manipulations of the observed data given the model.
Parameters for which a unique value can be obtained through a single algebraic manipulation are identified; those for which the value can be obtained through more than one manipulation of the observed data are overidentified. Free parameters for which a unique value cannot be obtained are unidentified. The algebraic manipulations by which parameter estimates are produced in SEM are complex and not essential to the nontechnical, application-oriented presentation offered in this book. Nonetheless, a simple equation for which algebraic manipulations are straightforward will illustrate the identification issue. Imagine an equation with two unknowns, x and y, and one known value, 3: x − y = 3.
Given only this equation, x and y are unidentified: An infinite number of values for x and y could satisfy the conditions of the equation. Imagine now that parameter x is fixed at a value of 6. The addition of this information renders both x and y identified: x is identified because the value has been supplied, and y is identified because, given the fixed value of x, its value must be 3.
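The toy equation can be restated as a short sketch. It simply enumerates a few of the infinitely many pairs that satisfy x − y = 3 and then shows that fixing x leaves a single admissible value for y. The range of candidate values is an arbitrary choice made for illustration.

```python
# With both x and y unknown, many pairs satisfy x - y = 3.
candidates = [(x, x - 3) for x in range(4, 9)]
assert all(x - y == 3 for x, y in candidates)

# Fixing x at 6 leaves exactly one value of y that satisfies the equation.
x_fixed = 6
y = x_fixed - 3

print(candidates)
print(y)  # 3
```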
Attention to the identification status of individual parameters in a model is a concern for local identification. Ultimately, however, the full set of free parameters in a model must be estimated. As a result, the concern is more typically for global identification – the identification status of the model as a whole. If all parameters in a model are identified, the model as a whole is identified and can be estimated. If one or more parameters is unidentified, then the model as a whole is unidentified and cannot be estimated. Identified models are of two types. Just identified models are those in which a unique value for each parameter can be obtained through one algebraic manipulation of the observed data given the model. Although the values of free parameters can be tested and interpreted in just identified models, such models are of minimal interest because they cannot be tested. Just identified models always result in an implied covariance matrix that exactly matches the observed covariance matrix. Overidentified models are those in which a unique value for one or more parameters can be obtained through two or more algebraic manipulations of the observed data given the model (illustrated in Chapter 3). Such models yield positive degrees of freedom and an implied covariance matrix that virtually always differs from the observed covariance matrix. These characteristics allow for tests of goodness of fit – the magnitude of difference between the observed and implied covariance matrices given the number of ways in which they might differ as reflected in degrees of freedom. Definitively determining the global identification status of a specified model requires a level of mathematical sophistication and a grasp of SEM well beyond what is typical for social and personality researchers. 
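The link between the number of observed variances and covariances and the degrees of freedom can be made concrete with a simple count. The sketch below uses the nine-indicator example model presented later in this chapter (Figure 2.3). The tally of free parameters is my own reading of that figure rather than a count stated in the text, and a nonnegative result is a necessary condition for identification, not a sufficient one.

```python
def n_observations(p):
    """Distinct variances and covariances among p observed variables."""
    return p * (p + 1) // 2


# Free parameters as read from Figure 2.3 (an assumption-labeled tally).
free_parameters = {
    "loadings": 6,
    "uniqueness variances": 9,
    "variance of F1": 1,
    "disturbance variances (d2, d3)": 2,
    "structural paths": 3,
}

observations = n_observations(9)
n_free = sum(free_parameters.values())
degrees_of_freedom = observations - n_free

print(observations, n_free, degrees_of_freedom)  # 45 21 24
```

Positive degrees of freedom indicate that, if the model is identified, it is overidentified and its fit can be tested.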
The only sufficient condition for establishing that a model is identified is evidence that a unique value for each free parameter could be obtained as a function of other parameters in the model given the data. Given the challenges involved in producing this evidence, the identification status of a model typically is evaluated against a set of necessary but not sufficient conditions. Although some of those conditions also are technical in nature and beyond the reach of most applied researchers, some are straightforward and rather easily established. For example, using methods described earlier, the variance of each latent variable must be identified by fixing a single parameter. Another condition concerns the number of indicators per latent variable. If a latent variable is modeled as uncorrelated with other variables in a model (e.g., a one-factor model or a model with orthogonal factors), it must have at least three indicators in order to be identified. This is an instance of the general identification rule that a model cannot have more free parameters than observations, with observations corresponding to the p(p + 1)/2 elements of the observed covariance matrix. A tempting but potentially misleading means of establishing global model identification is to rely on features in the computer program used for estimation to detect underidentification. Relevant error messages typically are generated based on a technical evaluation of the suitability of the parameter estimates. Although these error messages virtually always signal a problem with the model,
they rarely provide enough information to properly diagnose it. For instance, although a model may be, in the algebraic sense, identified, it may be empirically underidentified due to multicollinearity or essentially zero relations at one or more places in the model. Other aspects of the results (e.g., negative variances, very large standard errors) often serve to pinpoint the problematic parameters. Beyond concerns about identification, fixing the values of some parameters in a model is dictated by convention. For instance, in the absence of a compelling basis for doing otherwise, the covariances between uniquenesses are fixed at zero. Although simple-structure measurement models, those in which each indicator loads on only one latent variable, are not required, it is rare to see measurement models in which the loadings of indicators on additional factors are not fixed at zero. In panel models, the covariances between disturbances of concurrent endogenous variables typically are fixed at zero. Although the convention is for each of these parameter types to be fixed, only identification concerns prevent them from being freed. Indeed, post hoc modifications often involve freeing a subset of these parameters in order to better account for the observed data. Parameters in a model may also be fixed as a means of testing substantive hypotheses about relations within the model. As an example, consider a model with three latent variables, each indicated by three of nine observed variables. If the variances of the latent variables are identified by fixing their values at 1.0 (rather than fixing one of their loadings to 1.0), then the covariances between the latent variables are, in effect, correlations (comparable to standardizing a pair of variables, then computing their covariance). 
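The remark that a covariance between variables with variances of 1.0 is a correlation can be checked numerically with observed scores. The data and helper functions below are hypothetical and serve only to illustrate the equivalence.

```python
import math


def covariance(u, v):
    """Sample covariance of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v)) / (n - 1)


def standardize(v):
    """Rescale scores to a mean of 0 and a variance of 1."""
    mean_v = sum(v) / len(v)
    sd = math.sqrt(covariance(v, v))
    return [(vi - mean_v) / sd for vi in v]


# Hypothetical scores on two variables.
x = [2.0, 4.0, 4.0, 5.0, 7.0]
y = [1.0, 3.0, 2.0, 5.0, 6.0]

correlation = covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))
covariance_of_standardized = covariance(standardize(x), standardize(y))

# The two quantities agree up to floating-point error.
print(math.isclose(correlation, covariance_of_standardized))
```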
As such, the values of the three covariances could simultaneously be fixed at 1.0, producing a nested model and a statistical means of assessing the comparative fit of one- and three-factor models. Alternatively, regardless of how the variances of the latent variables are identified, the covariances could be fixed at zero, permitting a test of whether the three factors are orthogonal or correlated. Parameters may also be fixed indirectly through the use of equality constraints. Equality constraints require that, during the search for the optimal set of estimates for free parameters, those parameters constrained to be equal must have the same value. In some instances, equality constraints are used to model assumptions about certain relations in a model, as in the stationarity assumption (i.e., assumed equivalence of parallel parameters between each pair of waves) in some panel models. In other instances, equality constraints are used strategically to test hypotheses of interest. For example, in cross-lagged panel models (e.g., two variables measured on two occasions with each at Time 1 influencing the other at Time 2), a model in which concurrent cross-lagged paths are constrained to be equal can be compared to a model in which they are free to vary as a means of testing whether one variable has causal (or temporal) priority over the other. Equality constraints are particularly useful in multigroup models, in which the values of selected free parameters in a model are compared by simultaneously
estimating a model for two or more groups. These and other strategic applications of equality constraints are covered in Chapter 5.
How Models are Specified

Having now established the what of specification, I use an example to illustrate the how of specification. The how of specification refers to the expression of a model and its variables, relations, and parameters in a form suitable for communication and, ultimately, estimation. In the context of an example, I illustrate three approaches to expressing the specification of a model. To the path diagram, which I covered in the abstract in Chapter 1, I add equations and matrices.
Specifications in Path Diagram Form

For many researchers, the path diagram is the most intuitive approach to expressing the specification of a model. The use of path diagrams in this way is illustrated in Figure 2.3. Although, in the abstract, the model is the same as the one used to illustrate path diagrams in Chapter 1, subtle changes to the labeling make it more suitable for our purposes here. First, notice that the latent variables, F1, F2, and F3, have been labeled to indicate the constructs they represent. The model is based loosely on preliminary findings indicating that individuals high in agreeableness are more persuaded by strong arguments because they are, in general, more responsive to communications from other people than individuals low in agreeableness (Habashi & Wegener, 2008). Second, notice that the observed variables are now labeled with v and the uniquenesses with e. These labels allow for straightforward translation of the path diagram to equations in the format presented in the next section.

Path diagrams offer a number of advantages for expressing model specification. Most fundamental is making clear the position of all variables in the model. For instance, it is clear from Figure 2.3 that responsiveness mediates a relation in which agreeableness is the putative cause and persuasion the effect. Because path diagrams make clear the place of each variable in a model, they map well onto conceptual expressions of models. Indeed, the same figure could be used to introduce a model at the conceptual stage of development and make explicit the way in which the model was specified prior to estimation. The inclusion of symbols for free and nonzero fixed parameters allows for easy accounting for parameters. And, following estimation, the * can be replaced by parameter estimates as a means of presenting results.
A significant disadvantage of path diagrams for expressing model specification is that parameters fixed at zero are not shown explicitly in the diagram. Parameters that fit this description are of two types in the model shown in Figure 2.3: loadings of indicators on the two latent variables to which they are not assigned and covariances between uniquenesses. Although the appropriate paths with accompanying zeros could be
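[Figure 2.3 Example model specification expressed in path diagram form. Latent variables: F1 (Agreeableness) with indicators v1–v4, F2 (Responsiveness) with indicators v5 and v6, and F3 (Persuasion) with indicators v7–v9. Each indicator has a uniqueness (e1–e9); F2 and F3 have disturbances d2 and d3. One loading per latent variable is fixed at 1; free parameters are marked with *.]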
added to the diagram, doing so would significantly decrease its effectiveness due to clutter. Because parameters fixed at zero typically are not shown, path diagrams do not provide a full expression of model specification.
Specifications in Equation Form

An alternative approach to expressing the specification of a model is using a notation system developed by Bentler and Weeks (1980) and incorporated into the EQS computer program. Referring back to Figure 2.3, in which the labeling conforms to the Bentler–Weeks system, you can see that observed variables are labeled v, latent variables are labeled F, uniquenesses are labeled e, and disturbances are labeled d. Each is numbered arbitrarily, typically following the implicit causal flow of the model. These labels are used to construct two types of equations: measurement equations, which reflect the relations between indicators and latent variables, and structural equations, which reflect directional relations between variables.
Our example model implies nine measurement equations, expressed as follows:

    v1 = 1F1 + e1
    v2 = *F1 + e2
    v3 = *F1 + e3
    v4 = *F1 + e4
    v5 = 1F2 + e5
    v6 = *F2 + e6
    v7 = 1F3 + e7
    v8 = *F3 + e8
    v9 = *F3 + e9.

Although this model expression is compact, it suffers from the same limitation as path diagrams: it does not make explicit those parameters fixed at zero. The equations could be expanded to overcome that limitation:

    v1 = 1F1 + 0F2 + 0F3 + e1
    v2 = *F1 + 0F2 + 0F3 + e2
    v3 = *F1 + 0F2 + 0F3 + e3
    v4 = *F1 + 0F2 + 0F3 + e4
    v5 = 0F1 + 1F2 + 0F3 + e5
    v6 = 0F1 + *F2 + 0F3 + e6
    v7 = 0F1 + 0F2 + 1F3 + e7
    v8 = 0F1 + 0F2 + *F3 + e8
    v9 = 0F1 + 0F2 + *F3 + e9.

Expressed in this way, it is apparent that, of the 27 possible loadings, 21 are fixed (3 to identify the latent variable variances and 18 to exclude secondary loadings) and 6 are free. A pair of structural equations indicate the directional relations between latent variables:

    F2 = *F1 + d2
    F3 = *F1 + *F2 + d3.
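The loading counts stated above can be verified by writing the expanded measurement equations as a pattern of fixed and free entries. In this sketch the dictionary layout and the use of the string "*" for a free loading are illustrative conventions of my own, not part of the Bentler–Weeks system.

```python
# Rows are indicators v1..v9; columns are F1, F2, F3.
# "*" marks a free loading; numbers are the values of fixed loadings.
pattern = {
    "v1": (1, 0, 0),
    "v2": ("*", 0, 0),
    "v3": ("*", 0, 0),
    "v4": ("*", 0, 0),
    "v5": (0, 1, 0),
    "v6": (0, "*", 0),
    "v7": (0, 0, 1),
    "v8": (0, 0, "*"),
    "v9": (0, 0, "*"),
}

entries = [entry for row in pattern.values() for entry in row]

n_total = len(entries)
n_free = entries.count("*")
n_fixed_at_one = entries.count(1)    # identify latent variable variances
n_fixed_at_zero = entries.count(0)   # exclude secondary loadings

print(n_total, n_fixed_at_one + n_fixed_at_zero, n_free)  # 27 21 6
```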
In this model, the full set of relations between latent variables is specified, and therefore there are no zeros in the structural equations. For the purpose of illustration, imagine that the researcher predicted that the full effect of agreeableness on persuasion was transmitted through responsiveness. In that case, the F1-F3 path
would be dropped from the diagram, resulting in a zero next to F1 in the second structural equation.

The measurement and structural equations specify all directional relations in the model but do not specify the other parameters. Covariances, variances, and any parameters that are constrained in some way prior to estimation are represented in the Bentler–Weeks system using double-label notation. The name comes from the fact that any parameter can be designated by a pair of references to variables in the model. For variances, only a single variable is involved, resulting in the following labels for the model in Figure 2.3:

    F1,F1 = *
    e1,e1 = *
    e2,e2 = *
    e3,e3 = *
    e4,e4 = *
    e5,e5 = *
    e6,e6 = *
    e7,e7 = *
    e8,e8 = *
    e9,e9 = *.

The model includes no covariances, but for purposes of illustration, I return to the hypothetical model in which F2 is exogenous. In that model, the covariance between F1 and F2 would be designated as

    F1,F2 = *.

Two other aspects of specification not represented in the model bear mention. All of the variances are free as indicated by the *. Imagine that we identified the variance of F1 by fixing its value to 1.0 rather than fixing the loading of v1 on F1 to 1.0. In addition to replacing the 1 with an * in the first measurement equation, we would replace the first variance designation with

    F1,F1 = 1.

Finally, imagine we wanted to test the hypothesis that the loadings for v2 and v3 on F1 are equal. Using the Bentler–Weeks notation, we would indicate this feature of the specification as

    v2,F1 = v3,F1.
Specifications in Matrix Form

A third approach to communicating the specification of a model is using matrix notation. Sometimes referred to as LISREL notation because of its use in the LISREL computer program, matrix notation is used widely across the social sciences to communicate about SEM. An advantage of matrix notation is its widespread use in textbooks and the technical literature on SEM. Researchers who wish to keep abreast of the latest developments in SEM will be hard pressed to do so without familiarity with matrix notation. Another advantage is that matrix notation typically makes explicit all parameters in a model, including those fixed at zero. A key disadvantage of matrix notation for many researchers in social and personality psychology is that it assumes at least basic knowledge of linear algebra and makes use of symbols that are either not used in more familiar statistical models or used to denote different parameters in the SEM context than they are used to denote in those models.

I now illustrate matrix notation by using it to specify the model in Figure 2.3. As with the Bentler–Weeks system, a distinction is made between the measurement and structural components of the model. Within the measurement model, a further distinction is made between exogenous and endogenous latent variables. Exogenous latent variables are labeled ξ, their indicators as x, and uniquenesses as δ. Endogenous latent variables are labeled η, their indicators as y, and uniquenesses as ε. For both types of latent variables, loadings are labeled λ. The system is referred to as matrix notation because parameters are arrayed in matrices that vary in dimension according to the number of indicators and latent variables. The loadings are arrayed in Λx and Λy for exogenous and endogenous variables, respectively. Uniquenesses are arrayed in Θδ and Θε. The variances of exogenous latent variables, and the covariances between them if there is more than one, are included in Φ.
Variances of endogenous latent variables are not estimated. Variance unaccounted for in endogenous latent variables is allocated to disturbances, of which the variances and, in the event there is more than one, the covariances are arrayed in Ψ, which is part of the structural model. Using these labels and matrices, the measurement component of the model in Figure 2.3 is specified as follows. The relations between the exogenous latent variable and its indicators are expressed as

\[
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
=
\begin{bmatrix} 1 \\ \lambda_{21} \\ \lambda_{31} \end{bmatrix}
\begin{bmatrix} \xi_1 \end{bmatrix}
+
\begin{bmatrix} \delta_1 \\ \delta_2 \\ \delta_3 \end{bmatrix}
\]

or, more compactly (the bold typeface indicates a matrix),

\[
\mathbf{x} = \boldsymbol{\Lambda}_x \boldsymbol{\xi} + \boldsymbol{\delta}.
\]
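Because there is only one exogenous latent variable, the Φ matrix includes a single element corresponding to its variance:

\[ \boldsymbol{\Phi} = [\phi_{11}] \]

The relations between the endogenous latent variables and their indicators are expressed as

\[
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \end{bmatrix}
=
\begin{bmatrix}
1 & 0 \\
\lambda_{21} & 0 \\
\lambda_{31} & 1 \\
0 & \lambda_{42} \\
0 & \lambda_{52}
\end{bmatrix}
\begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix}
+
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \end{bmatrix}
\]

or, more compactly,

\[ \mathbf{y} = \boldsymbol{\Lambda}_y \boldsymbol{\eta} + \boldsymbol{\varepsilon}. \]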
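Parameters in the structural model are specified in two matrices: Γ, which includes coefficients on paths from exogenous to endogenous latent variables, and Β, which includes coefficients for paths between endogenous latent variables. Disturbances are labeled ζ. Using these labels and matrices, the structural relations in our example are specified in matrix notation as

\[
\begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix}
=
\begin{bmatrix} 0 & 0 \\ \beta_{21} & 0 \end{bmatrix}
\begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix}
+
\begin{bmatrix} \gamma_{11} \\ \gamma_{21} \end{bmatrix}
[\xi_1]
+
\begin{bmatrix} \zeta_1 \\ \zeta_2 \end{bmatrix}
\]

or, more compactly,

\[ \boldsymbol{\eta} = \mathbf{B}\boldsymbol{\eta} + \boldsymbol{\Gamma}\boldsymbol{\xi} + \boldsymbol{\zeta}. \]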
The variances of the disturbances associated with the two endogenous latent variables in the model and their covariance (which is fixed at zero) are specified as

\[
\boldsymbol{\Psi} =
\begin{bmatrix} \psi_{11} & \\ 0 & \psi_{22} \end{bmatrix}.
\]
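The structural equations can also be checked numerically. The following is a minimal sketch, assuming hypothetical values for γ11, γ21, β21, ξ1, and the two disturbances (none of these numbers come from the example model). It rewrites η = Bη + Γξ + ζ as η = (I − B)⁻¹(Γξ + ζ) and confirms that the matrix solution agrees with the two equations written out one at a time.

```python
# Hypothetical parameter values, chosen only for illustration.
gamma11, gamma21, beta21 = 0.50, 0.30, 0.40
xi1 = 2.0                      # score on the exogenous latent variable
zeta = [0.10, -0.20]           # disturbances for eta1 and eta2

B = [[0.0, 0.0],
     [beta21, 0.0]]            # paths among endogenous latent variables
Gamma = [[gamma11],
         [gamma21]]            # paths from exogenous to endogenous variables


def solve_structural(B, Gamma, xi, zeta):
    """Return eta = (I - B)^-1 (Gamma*xi + zeta) for a 2 x 2 B matrix."""
    # Right-hand side: Gamma*xi + zeta
    rhs = [Gamma[i][0] * xi + zeta[i] for i in range(2)]
    # I - B and its inverse (closed form for a 2 x 2 matrix)
    a, b = 1.0 - B[0][0], -B[0][1]
    c, d = -B[1][0], 1.0 - B[1][1]
    det = a * d - b * c
    inv = [[d / det, -b / det],
           [-c / det, a / det]]
    return [inv[i][0] * rhs[0] + inv[i][1] * rhs[1] for i in range(2)]


eta = solve_structural(B, Gamma, xi1, zeta)

# The same result, one equation at a time.
eta1 = gamma11 * xi1 + zeta[0]
eta2 = beta21 * eta1 + gamma21 * xi1 + zeta[1]
print(eta, [eta1, eta2])
```

Because the first row of B contains only zeros, η1 depends on ξ1 alone, whereas η2 depends on both ξ1 and η1.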
A virtue of matrix notation is that the status of every parameter in the model is made explicit. Whether the specification of a model is expressed in diagram, equation, or matrix form is a matter of taste, norms governing the literature to which one contributes, and the computer program one uses to estimate the model. Each approach requires acknowledging and making a decision about the status of every
parameter in the model with a concern for the identification of the model as a whole and each parameter in it. Most social and personality researchers are not accustomed to engaging the statistical model they are using at this level. Although an initial investment in mastering the material covered in this chapter is required, the payoff is access to an exceptionally flexible method for modeling data. Fortunately, when a model has been fully specified, the remaining steps in an implementation of SEM are similar to steps in the application of more familiar statistical methods.
3
Estimation and Fit
A model that has been properly specified can be estimated and its fit to the data evaluated. Because the goal of estimation is to maximize fit, and the various statistics and indices used to judge fit are based on information generated during estimation, I cover the two topics in a single chapter. In the first part of the chapter, I discuss the basic logic of estimation, focusing specifically on estimation using the method of maximum likelihood. Following relatively brief coverage of alternative estimation methods, I offer a selective review of methods of indexing fit. In that section, I touch on the complex issue of statistical power in SEM. The chapter concludes with a succinct review of computer programs for estimating and testing the fit of models in SEM.
Estimation
Estimation in SEM is necessitated by overidentification. This point can be illustrated using simple equations. Returning to the example given in Chapter 2, imagine a single equation with two unknowns, for which values are to be estimated: x − y = 3.
This model is unidentified because unique values for x and y cannot be determined given the information in the equation. You will recall that I showed how fixing x to a value of 6 resulted in the identification of both x and y. When x is fixed, the resulting equation includes only one unknown and, with one known value and one unknown, this simple model is just identified. The precise value of y is rather easily determined; indeed, it is not an estimate. Moreover, the values of x and y can be used to reproduce the observed data, 3, exactly. An alternative means of identifying x and y is the addition of a second equation. Suppose we added the following equation to our model: 2x − y = 9.
Although we still have two unknowns, x and y, we have two known values, 3 and 9. Again, our model is just identified, as evidenced by the fact that only one set of
values for x and y satisfies the equations, x = 6 and y = 3, and it perfectly reproduces the observed data. Imagine now that we add a third equation to the model: 2x − 3y = 5.
We now have an overidentified model: three known values, 3, 9, and 5, but only two unknowns, x and y. It is no longer possible to find values of x and y that perfectly reproduce the observed data. Our only option now is to find values of x and y that yield outcomes as close as possible to the observed outcomes. “As close as possible” may be defined in different ways. For instance, using a criterion familiar to social and personality researchers, we might seek values of x and y that minimize the mean of the squared differences between the observed and predicted data. This criterion yields values of 5.44 and 2.00 for x and y, respectively, which result in values of 3.44, 8.89, and 4.89 for the three equations. The discrepancy between the set of predicted values derived from the estimates and the set of observed values is an indication of how well our model, expressed in the three equations, accounts for, or reproduces, the observed data.
Expressed more formally, the goal of estimation in the case of an overidentified model is the minimization of the discrepancy between the observed covariance matrix and the covariance matrix implied (i.e., predicted) by the model. The observed covariance matrix, S, is assumed to reflect a population covariance matrix, Σ. The implied covariance matrix, Σ(θ), is the observed covariance matrix as a function of the parameters in the model gathered in θ. This notation is the basis for the basic null hypothesis expressed in population terms, Σ = Σ(θ).
Because the population values of free parameters in θ are not known, this hypothesis may be expressed as

\[ \Sigma = \Sigma(\hat{\theta}) \]

to indicate that some subset of parameters in θ is estimated. This term sometimes is written in shorthand as Σ̂. The population covariance matrix is not known either. As such, the covariance matrix based on data from a sample is used to test the hypothesis. Given this state of affairs, the hypothesis as evaluated in practice can be restated as

\[ S = \hat{\Sigma}. \]

S is derived from observed data. Σ̂ must be generated through estimation.
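The overidentified three-equation illustration from earlier in this section can be worked through numerically. The following is a minimal sketch, assuming that ordinary (unweighted) least squares is the criterion; the function name and layout are illustrative only. It solves the normal equations for the two unknowns and then computes the predicted values and residuals.

```python
# Three equations in two unknowns: coefficients on x and y, and observed values.
A = [(1.0, -1.0),   # x - y   = 3
     (2.0, -1.0),   # 2x - y  = 9
     (2.0, -3.0)]   # 2x - 3y = 5
b = [3.0, 9.0, 5.0]


def least_squares_2(A, b):
    """Solve the two-unknown normal equations (A'A)[x, y]' = A'b."""
    s11 = sum(r[0] * r[0] for r in A)
    s12 = sum(r[0] * r[1] for r in A)
    s22 = sum(r[1] * r[1] for r in A)
    t1 = sum(r[0] * v for r, v in zip(A, b))
    t2 = sum(r[1] * v for r, v in zip(A, b))
    det = s11 * s22 - s12 * s12
    x = (s22 * t1 - s12 * t2) / det
    y = (s11 * t2 - s12 * t1) / det
    return x, y


x, y = least_squares_2(A, b)
predicted = [r[0] * x + r[1] * y for r in A]
residuals = [obs - pred for obs, pred in zip(b, predicted)]
print(round(x, 2), round(y, 2))
print([round(p, 2) for p in predicted])
print([round(r, 2) for r in residuals])
```

Under this assumption the solution is x = 49/9 (about 5.44) and y = 2, with predicted values of about 3.44, 8.89, and 4.89. No choice of x and y drives all three residuals to zero, which is what makes the discrepancy informative about fit.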
Maximum Likelihood Estimation
The earliest forms of contemporary SEM made use of maximum likelihood (ML) to obtain parameter estimates (e.g., Jöreskog, 1967; 1969), and ML has been the default estimator in SEM computer programs since the earliest versions of the LISREL program. The goal of ML estimation is to find a set of estimates for the free parameters that maximizes the likelihood of the data given the specified model (see Myung, 2003, for additional information). This likelihood is maximized when the value of the fitting function, FML, is minimized:

\[ F_{ML} = \log|\Sigma(\theta)| + \mathrm{tr}\left(S\Sigma^{-1}(\theta)\right) - \log|S| - p, \]

where p is the number of observed variables.
The astute reader will notice that embedded in this intimidating equation is the simple discrepancy between S and Σ̂ discussed earlier. To the left of the minus sign is Σ(θ) and to the right of the minus sign is S. All else being equal, as the difference between Σ(θ) and S decreases, the value of FML decreases.
ML estimation is an iterative procedure that begins with a somewhat arbitrary set of start values for the free parameters and updates these values until the difference between the observed covariance matrix and the covariance matrix implied by the model is minimized as reflected in the value of the ML fitting function. At this point, the estimation procedure is said to have converged, and the adequacy of the resultant model is evaluated.
I illustrate the iteration and convergence process using data on a measure of individual differences in the propensity to self-handicap. The measure comprises 25 items. Only the seven that form the internal–claimed subscale are included in the analyses presented here. The covariance matrix on which the analysis is based is presented in the top panel of Table 3.1. The model is specified so that each of the seven indicators, v1 to v7, loads on a single latent variable, F1. The loading of v1 on F1 is fixed at 1.0 in order to identify the variance of F1. Otherwise, no parameters are fixed apart from the customary zero covariances between uniquenesses. Thus, in addition to the one fixed parameter, the model includes 14 free parameters: the variance of F1, the loadings of v2 to v7 on F1, and the uniqueness associated with each indicator, e1 to e7. The number of parameters to be estimated is 14 fewer than the 28 unique elements in the observed covariance matrix. We cannot definitively determine whether the model is identified without demonstrating algebraically that a unique value for each parameter could be obtained through at least one manipulation of other parameters in the model.
Minimally, however, we can determine whether the specification satisfies the necessary but not sufficient criteria outlined in Chapter 2. Because there are more than three indicators of the latent variable, the value of one of the loadings is fixed, and the number of free parameters does not exceed the number of elements in the observed covariance matrix, the model satisfies these criteria.
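The counting part of this check is simple arithmetic. The following is a minimal sketch, with a hypothetical helper name, that tallies the unique elements of the observed covariance matrix and the free parameters of the one-factor model. It addresses only the counting condition, not identification in the algebraic sense.

```python
def count_check(n_indicators, n_free):
    """Return (unique covariance elements, elements minus free parameters)."""
    unique = n_indicators * (n_indicators + 1) // 2
    return unique, unique - n_free


# One-factor model for the seven items: 6 free loadings (the loading of v1
# is fixed at 1.0), 7 uniquenesses, and 1 latent variable variance.
n_free = 6 + 7 + 1
unique, surplus = count_check(7, n_free)
counting_ok = n_free <= unique
print(unique, n_free, surplus, counting_ok)
```

With seven indicators there are 28 unique variances and covariances, so the 14 free parameters leave a surplus of 14 elements.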
Table 3.1 Observed, Implied, and Residual Covariance Matrices (N = 505)

Observed Covariance Matrix
        v1      v2      v3      v4      v5      v6      v7
v1    1.546
v2     .506   1.231
v3     .598    .353   1.765
v4     .472    .264    .683   1.481
v5     .564    .388    .967    .805   1.808
v6     .471    .271    .753    .718    .684   1.504
v7     .384    .255    .566    .509    .701    .379   1.371

Implied Covariance Matrix
        v1      v2      v3      v4      v5      v6      v7
v1    1.546
v2     .255   1.231
v3     .587    .380   1.765
v4     .501    .325    .748   1.481
v5     .624    .405    .932    .796   1.808
v6     .473    .307    .707    .604    .752   1.504
v7     .393    .255    .587    .501    .624    .473   1.371

Residual Matrix
        v1      v2      v3      v4      v5      v6      v7
v1     .000
v2     .251    .000
v3     .012   -.028    .000
v4    -.029   -.061   -.066    .000
v5    -.060   -.017    .034    .009    .000
v6    -.002   -.036    .047    .115   -.068    .000
v7    -.008    .000   -.020    .008    .077   -.094    .000

Standardized Residual Matrix
        v1      v2      v3      v4      v5      v6      v7
v1     .000
v2     .182    .000
v3     .007   -.019    .000
v4    -.019   -.045   -.041    .000
v5    -.036   -.011    .019    .005    .000
v6    -.001   -.026    .029    .077   -.041    .000
v7    -.006    .000   -.013    .006    .049   -.066    .000
Table 3.2 Change in Maximum Likelihood Parameter Estimates and Fitting Function as Iterative Process Converges on a Minimum

        Factor Loadings                                 Variances
Iter.   v1a    v2    v3    v4    v5    v6    v7      e1    e2    e3    e4    e5    e6    e7    F1      FML
Start   1.00  1.00  1.00  1.00  1.00  1.00  1.00    1.39  1.11  1.59  1.33  1.63  1.35  1.23  1.00
0       1.00   .67   .73   .70   .74   .70   .69    1.10   .83  1.29  1.04  1.33  1.06   .94  1.53   .49017
1       1.00   .67  1.00   .90  1.03   .87   .79    1.19  1.06   .98   .88   .95   .96  1.00   .36   .33074
2       1.00   .66  1.69  1.42  1.80  1.34  1.07    1.18  1.07   .91   .85   .85   .94   .99   .36   .12662
3       1.00   .65  1.51  1.29  1.60  1.22  1.01    1.16  1.07   .89   .84   .81   .93   .98   .39   .11524
4       1.00   .65  1.50  1.28  1.59  1.21  1.00    1.15  1.07   .89   .84   .81   .93   .98   .39   .11519
5       1.00   .65  1.49  1.28  1.59  1.21  1.00    1.15  1.07   .89   .84   .81   .93   .98   .39   .11519

Note: Estimates obtained using the maximum likelihood procedure in version 6.1 of EQS for Windows. Iter = iteration number. FML = maximum likelihood fitting function. a Loading fixed to achieve identification.

The information in Table 3.2 offers insight into the ML estimation process. First, look at the rightmost column in the table. Labeled FML, values in this column are values of the ML fitting function based on the parameter estimates following each iteration. Notice that the value declines at a decreasing rate after each iteration, concluding with two values that do not differ at the fifth decimal place. Moving now to the first line of the table, start values are listed. No value of FML is listed because these are not estimates. Notice that the start value for six of the seven factor loadings is 1.00. It is important to recognize that the value of the loading for the first indicator, though it is the same as the value for the remaining loadings, is not a start value. Rather, it is a fixed value, which is evident because its value does not change from one iteration to the next. The start value for the variance of F1 also is 1.00. The start values for the uniquenesses (e1-e7), set by
default in the EQS computer program, are 90% of the observed variances of the variables (i.e., the diagonal in the observed covariance matrix). Other SEM programs use different default start values, and all programs allow the user to provide custom values. Importantly, although fewer or more iterations might be required for different sets of start values, the final set of estimates will be the same. ML estimation of the specified model required five iterations to converge. That is, after five iterations, the parameter estimates could not be updated so as to further minimize the value of the fitting function. Notice that, with the exception of the first column of parameter values, values of the parameters change from one iteration to the next. In some cases, as with the factor loading for v2, the value changes little after the first iteration. In other cases, as with the loading for v3, the value does not begin to converge on an optimal estimate until the third or fourth iteration. Iteration ceases when the value of the fitting function will not decrease with updated parameter estimates to a degree that exceeds a predetermined criterion, .00001 in this instance. The value of the fitting function at the point of convergence forms the basis for most fit statistics and indices. Attempts at estimation do not always result in convergence. In such cases, examination of the iteration history will find values for certain parameters increasing then decreasing from one iteration to the next, often taking on values that are implausible given the metrics of the variables involved. Similarly the value of the fitting function, rather than smoothly declining at a decreasing rate as in the example, may decline for several iterations before increasing. In such cases, the estimator is unable to find a unique set of parameter estimates that minimizes the value of the fitting function. 
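The decline in the fitting function can be reproduced approximately from the tabled values. The following is a minimal sketch, not the procedure EQS itself uses: it evaluates the ML fitting function for the observed covariance matrix in Table 3.1 at two sets of parameter values taken from Table 3.2 (the rows labeled 0 and 5). The helper names are illustrative, the matrix routine is a plain Gauss-Jordan elimination that assumes a positive definite matrix, and because the table reports estimates to only two decimals the results will be close to, rather than identical to, the tabled values of FML.

```python
import math

# Observed covariance matrix from Table 3.1 (lower triangle, v1 to v7).
LOWER = [
    [1.546],
    [.506, 1.231],
    [.598, .353, 1.765],
    [.472, .264, .683, 1.481],
    [.564, .388, .967, .805, 1.808],
    [.471, .271, .753, .718, .684, 1.504],
    [.384, .255, .566, .509, .701, .379, 1.371],
]
P = len(LOWER)
S = [[LOWER[max(i, j)][min(i, j)] for j in range(P)] for i in range(P)]


def implied(loadings, uniq, var_f):
    """One-factor implied covariances, with uniquenesses added on the diagonal."""
    return [[loadings[i] * loadings[j] * var_f + (uniq[i] if i == j else 0.0)
             for j in range(P)] for i in range(P)]


def logdet_and_inverse(m):
    """Gauss-Jordan elimination without pivoting (assumes positive definiteness)."""
    n = len(m)
    a = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(m)]
    logdet = 0.0
    for k in range(n):
        pivot = a[k][k]
        logdet += math.log(pivot)
        a[k] = [v / pivot for v in a[k]]
        for i in range(n):
            if i != k:
                f = a[i][k]
                a[i] = [vi - f * vk for vi, vk in zip(a[i], a[k])]
    return logdet, [row[n:] for row in a]


def f_ml(S, sigma):
    """F_ML = log|Sigma| + tr(S Sigma^-1) - log|S| - p."""
    logdet_sigma, sigma_inv = logdet_and_inverse(sigma)
    logdet_s, _ = logdet_and_inverse(S)
    trace = sum(S[i][j] * sigma_inv[j][i] for i in range(P) for j in range(P))
    return logdet_sigma + trace - logdet_s - P


# Rows "0" and "5" of Table 3.2 (loadings, uniquenesses, variance of F1).
iter0 = implied([1.00, .67, .73, .70, .74, .70, .69],
                [1.10, .83, 1.29, 1.04, 1.33, 1.06, .94], 1.53)
iter5 = implied([1.00, .65, 1.49, 1.28, 1.59, 1.21, 1.00],
                [1.15, 1.07, .89, .84, .81, .93, .98], .39)
f0, f5 = f_ml(S, iter0), f_ml(S, iter5)
print(round(f0, 3), round(f5, 3))
```

The value at the final iteration should be smaller than the value at iteration 0 and should fall near the tabled minimum of .11519. Substituting S itself for the implied matrix returns zero, the value that corresponds to perfect fit.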
Most SEM computer programs terminate the iteration process at a number that properly specified models rarely require (e.g., 30) and alert the user of a failure to converge. Failures to converge typically are the result of underidentification somewhere in the model. They may also result for large models that include many free parameters even when all parameters are identified. In such cases, the iteration history will show the expected pattern of declining values of the fitting function but values for the final two iterations indicating that the minimum has not been reached. In such cases, the researcher has two options: (1) substitute the values of the free parameters at the final iteration for the default start values (the equivalent of beginning estimation at the 30th iteration); or (2) override the default limit on number of iterations imposed by the computer program (an example of this is provided in Chapter 4). Parameter estimates and statistical tests are legitimate only for converged solutions. Even when estimation results in convergence, it may not yield proper parameter estimates and fit statistics. For this reason, before estimates and tests are taken as final, they should be examined carefully for implausible or anomalous values. The most common implausible value is the Heywood case, in which the estimated value of a variance (typically a uniqueness) is negative. Although some computer programs permit negative variance estimates, others require positive variance
estimates. Heywood cases are signaled in those programs by variances that are precisely zero, often accompanied by a message indicating the estimates were “constrained at lower bound.” Anomalous results often take the form of very large or small standard errors for parameter estimates. To illustrate, I intentionally introduced an underidentified parameter into the example model by freeing the factor loading fixed at 1.0 to identify the latent variable. Estimation of this model resulted in convergence after four iterations; however, the value of the standard error for one of the loadings, because it could not be estimated, was set to zero by the computer program. In this case, the evidence of underidentification was an anomalous value rather than a failure to converge. As with failure to converge, estimates and tests should not be reported when estimation yields implausible or anomalous values.
The final set of parameter estimates is substituted in θ to produce Σ̂, the implied covariance matrix. The implied covariance matrix for the current example is shown in the second panel of Table 3.1. This matrix can be interpreted as what the population data would be if our model were the true model. As established earlier, our primary concern is the degree to which this matrix corresponds to the observed covariance matrix. Subtracting corresponding elements from the two matrices yields the residual matrix, shown in the third panel of Table 3.1. Notice that some values are positive and some negative, indicating that our model overestimates some covariances and underestimates others. For instance, the model underestimates the covariance of v1 and v2, which was .506 in the observed matrix but only .255 in the implied matrix. Because values in the residual matrix are covariances, which convey association with reference to the metrics of the variables, they are not readily interpretable.
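The arithmetic behind a single cell of the implied and residual matrices can be sketched as follows for v1 and v2. This is a minimal illustration using the rounded final estimates in Table 3.2, so the results match Table 3.1 only to about two decimals. The standardization step assumes the residual is divided by the product of the observed standard deviations, which places it on a correlation metric; other programs may standardize differently.

```python
import math

# Final estimates from Table 3.2 (reported there to two decimals).
loading = {"v1": 1.00, "v2": 0.65}
var_f1 = 0.39

# Observed values from Table 3.1.
obs_cov = 0.506
obs_var = {"v1": 1.546, "v2": 1.231}

# In a one-factor model the implied covariance of two indicators is the
# product of their loadings and the variance of the latent variable.
implied_cov = loading["v1"] * loading["v2"] * var_f1
residual = obs_cov - implied_cov

# Assumption: standardize by the observed standard deviations.
sd_product = math.sqrt(obs_var["v1"] * obs_var["v2"])
std_residual = residual / sd_product
zero_order_r = obs_cov / sd_product
print(round(residual, 2), round(std_residual, 2), round(zero_order_r, 2))
```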
The final matrix shown in Table 3.1 is the standardized residual matrix, whose values are the equivalent of correlation coefficients. We can now see that v1 and v2 remain correlated at r = .18 after accounting for their association as reflected in the latent variable. Comparing this value to their zero-order correlation, r = .37, may offer insight into their performance as items on the scale or, as illustrated below, suggest ways to modify the model to improve its fit. The use of ML estimation is not always warranted. ML, like other normal theory estimators (e.g., generalized least squares), assumes no skewness or kurtosis in the joint distribution of the variables; that is, multivariate normality. Multivariate normality is not easily verified and corrected. The EQS computer program provides a statistical test of nonnormality in the fourth moment of the multivariate distribution, but, practically speaking, the test is more sensitive than the ML estimator to nonnormality. On occasion, multivariate nonnormality can be attributed to a small number of multivariate outliers, which can be detected using data visualization techniques such as those offered in ViSta (Young, 1996). Although a set of normally distributed variables does not necessarily yield multivariate normality, strategies for addressing nonnormality typically focus on individual variables,
making the evaluation of normality at the univariate level worthwhile. In some instances, transformations of one or more variables or parceling of indicators will reduce multivariate nonnormality to a point that ML estimation is warranted (West, Finch, & Curran, 1995). ML estimation also assumes that variables are continuous. In reality, few variables in social and personality psychology are assessed on continuous scales. The norm is a set of five, seven, or nine response options. Fortunately, ML estimation is reasonably robust to violations of the assumption of continuous measurement, producing acceptable results with measures that include five or more categories (Johnson & Creech, 1983). Parceling indicators with too few categories is a means of producing variables that are suitable for ML estimation. Other assumptions are shared by all estimators. These include independence of observations, a sample that is sufficiently large to ensure the asymptotic properties of the estimator are realized, and a model that holds in the population. Although no estimator is robust to the independence assumption, simulation studies suggest that, for many modeling situations, samples do not have to be as large as originally assumed for valid estimation (200–400 for most models, and as low as 100 for small models that fit well and normally distributed data). Because the population model for a given set of data and variables is rarely known, the assumption that the specified model is the correct one is rarely subject to verification. Implicit is the assumption that, if a model yields acceptable fit statistics, it is a close enough approximation to the model in the population that parameter estimates and tests are valid.
Other Estimation Methods
Data often do not meet the assumptions of ML estimation, forcing the researcher to turn to other estimation methods. These methods generally fall into two classes that differ according to which violation of normal theory estimator assumptions they address: estimators for nonnormal data and estimators for variables measured noncontinuously. When the data are not multivariate normal and the variables are continuous, researchers have two options. The first is to choose an alternative estimator that, by incorporating values of the third (skewness) and fourth (kurtosis) moments in estimation, does not assume normality (e.g., Browne, 1984). Data on these moments are gathered in a weight matrix, which is used in estimation. Because the weight matrix typically is large and its inclusion in estimation computationally intensive, distribution-free methods generally perform poorly except in samples that, by social and personality psychology standards, are very large (> 2000). Because the primary effect of nonnormality is not on the parameter estimates themselves, but rather on their standard errors and fit statistics, an alternative strategy is to estimate the parameters using a normal theory estimator such as ML, then
correct standard errors and fit statistics based on the degree of excess skewness and kurtosis in the data (e.g., Browne & Shapiro, 1987; Satorra & Bentler, 1994). This scaling correction produces robust standard errors and fit statistics that are corrected for nonnormality. Importantly, post-estimation corrections perform well in modest-sized samples such as those typical in social and personality research. A final alternative is to use bootstrapping methods to generate standard errors and fit statistics based on the distributions of the variables in the sample (Bollen & Stine, 1992). When variables are not measured continuously, which practically speaking means that four or fewer values are represented in the data, an estimator for categorical data must be used. The dominant approach to estimating models from categorical variables is Muthén’s (1983) categorical variable methodology (CVM). Now fully implemented in the Mplus computer program, CVM builds from distribution-free estimation by incorporating information about the metric and higher order moments of the data in the analysis. An alternative approach, offered in LISREL via PRELIS and EQS, is to generate and analyze polyserial (one variable is categorical) or polychoric (both variables are categorical) correlations, which are valid when a normally distributed continuous variable can be assumed to underlie ordered categories. An emerging strategy for working with categorical data is Bayesian estimation (Lee & Tang, 2006), now available in most stand-alone SEM computer programs.
Evaluation of Fit
A byproduct of estimation is a number of values that are useful for judging the adequacy of a model. The most basic of these is the value of the fitting function, which approaches zero as the implied covariance matrix approaches the observed covariance matrix. Although a value of zero for the fitting function has clear meaning (the model perfectly recovers the observed data), other values do not. As such, the value of the fitting function is rarely consulted directly. Rather, it is used as the basis for a number of values to which statistical or rule-of-thumb criteria are applied. In addition, standard errors are generated for estimates of free parameters, and these are used to judge the degree to which specific parameters conform to predictions. The validity of different approaches to evaluating fit is a topic of considerable disagreement and debate (e.g., MacCallum, 2003; Meehl & Waller, 2002). Although norms have developed in the different disciplines and research literatures in which SEM is commonly used, there are no iron-clad rules for judging the adequacy of a model. In the remainder of this section, I describe a subset of the large number of indices and statistics, focusing on those that perform well with data and models typical of social and personality research and for which specific criterion values have been proposed and are generally endorsed by methodologists and reviewers.
Omnibus Fit
The most basic question to ask about a model with its fixed parameters and estimated free parameters is whether it offers an acceptable account of the observed data. When considered in this way, the evaluation concerns omnibus fit. Alternatives to omnibus fit are relative fit, in which two or more models are compared, and component fit, which concerns the magnitude and sign of individual free parameters in a model. The goal of tests or less formal evaluations of omnibus fit is the determination of whether the model reproduces the set of observed covariances at an acceptable level. The criterion on which this determination is based may be perfect fit (i.e., the implied and observed covariance matrices are identical) or close fit (i.e., the implied covariance matrix is an acceptable approximation of the observed covariance matrix).
The oldest and, to some degree, most straightforward test of omnibus fit is the “χ2” test. I enclose χ2 in quotes because the numeric value used in this test is a derived value that only follows a χ2 distribution under specific conditions that are rarely met in practice. As such, although the test follows the well-known, if not well-accepted, null hypothesis statistical testing approach with which researchers in social and personality psychology are familiar and is of considerable historical significance, it is never used as the sole test of omnibus fit, and often is not used at all. The value of χ2 for an estimated model is computed rather simply as the product of the minimized value of the ML fitting function and the sample size minus 1.0. Referring back to the rightmost column in Table 3.2, we derive the observed value of χ2 for the example model as

\[ \chi^2 = .11519(505 - 1) = 58.06. \]
The number of degrees of freedom for determining the reference distribution is also simple to derive. The value is the number of unique elements in the observed covariance matrix, p(p + 1)/2, minus the number of free parameters in the specified model. The example model is based on seven variables and, as established earlier, includes 14 free parameters. Thus, degrees of freedom for the χ2 test are df = [7(7 + 1)/2] − 14 = 14.
Recall that the null hypothesis is expressed as

\[ H_0: \Sigma = \Sigma(\hat{\theta}). \]
Because the hypothesis indicates that the observed and implied covariance matrices are identical, the desired statistical outcome is a failure to reject the null hypothesis. Using the .05 probability level, the critical value for the example model is 23.69. If the result of the test is interpreted as intended, the probability of the null hypothesis is unacceptably low (well below .001) given the model and data.
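The steps of the test can be traced in a few lines. The following is a minimal sketch under the normal theory assumptions described in the text. The tail-probability function is a closed form that holds only when the degrees of freedom are even, which happens to be the case here; it is used to keep the example self-contained and is not a general routine.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variate (even df only)."""
    if df % 2 != 0:
        raise ValueError("this closed form requires even degrees of freedom")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


n, f_min = 505, 0.11519      # sample size and minimized F_ML (Table 3.2)
p_vars, n_free = 7, 14       # observed variables and free parameters

chi2 = f_min * (n - 1)
df = p_vars * (p_vars + 1) // 2 - n_free
p_value = chi2_sf_even_df(chi2, df)
print(round(chi2, 2), df, p_value < 0.001)
```

Evaluating the same function at 23.69 with 14 degrees of freedom returns a probability of about .05, which is how the critical value can be checked.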
For a number of reasons, this seemingly elegant test has fallen into disfavor. Most fundamentally, the product that yields the observed value of the statistic only follows a χ2 distribution under conditions that are rarely realized in practice (Bentler, 1990). Specifically, the value of the fitting function weighted by the adjusted sample size is only distributed as χ2 when the distribution of the data is multivariate normal, the sample size is large, and the model is correct in the population. Yet, even if these conditions were met, the test would nonetheless be of questionable value because it tests a hypothesis that, except in the case of perfect fit (i.e., all zeros in the residual matrix), can always be rejected with a sufficiently large sample, even when the implied and observed covariance matrices are only trivially different. Dissatisfaction with the “χ2” test motivated the development of various alternative fit indices, which now number more than 30. One useful class of indices is based on the comparison of the fit of a specified model to the fit of an independence, or null, model (Bentler & Bonett, 1980). Many of these indices are normed, meaning that their values, typically some type of proportion, fall between zero and 1.0. The basic logic is clear from inspection of the original form of this index, the normed fit index (NFI), which can be expressed as

\[ \mathrm{NFI} = \frac{\chi^2_{\text{independence}} - \chi^2_{\text{specified}}}{\chi^2_{\text{independence}}}. \]
χ2independence indexes the fit of a model in which only variances are modeled; that is, no attempt is made to account for covariance between variables. χ2specified indexes the fit of the model of interest. When the fit of the specified model is perfect, yielding a value of zero for χ2specified , the value of NFI is 1.0. Conversely, when the value of χ2specified is large, reflecting a poor accounting for the observed covariances by the specified model, the value of NFI approaches zero. As such, NFI indexes the proportionate improvement in fit of a specified model over the independence model. Despite its intuitive appeal, NFI suffers from two drawbacks. Most fundamentally, because it is a descriptive index and not a statistic, there is no reference distribution or other objective basis for deciding the cutoff above which a model can be said to offer an acceptable account of the data. In addition, simulation studies of the performance of the index under various conditions indicated a problematic correlation with sample size such that the same model would yield higher values of NFI with larger samples. For this latter reason, NFI is no longer used to index model fit. Its historical significance should not be underestimated, however, as the basic logic was carried forward to a significant number of the alternatives in current use. The question of what value of NFI and similar normed indices should be interpreted as support for a model continues to be a concern. The norm for NFI
became .90, a value that continues to be used for normed fit indices despite the lack of a clear statistical or logical justification (cf. Hu & Bentler, 1999). The logic, but not the performance concerns, of NFI is reflected in the comparative fit index (CFI; Bentler, 1990), the best documented of the many incremental, or comparative, fit indices developed in an attempt to improve on NFI. Unlike the NFI, which is based on the central χ2 distribution, CFI is based on the noncentral χ2, the more appropriate reference distribution for models other than the true population model. Noncentrality, which reflects degree of misspecification, is indexed as χ2 − df. Thus, the basic form of NFI, recast in noncentral χ2 terms as CFI, is calculated as CFI =
$$\frac{(\chi^2_{\text{independence}} - df_{\text{independence}}) - (\chi^2_{\text{specified}} - df_{\text{specified}})}{\chi^2_{\text{independence}} - df_{\text{independence}}}.$$
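As a minimal sketch of the two incremental indices, the following Python functions compute NFI and CFI from the χ2 values and degrees of freedom of the independence and specified models. The function names are hypothetical, and the independence-model χ2 in the usage lines is an invented value for illustration only (it is not reported for the example model); the df of 21 is simply the number of covariances among seven variables. Truncating negative noncentrality at zero is an assumed convention that keeps CFI between 0 and 1 when sampling error pushes χ2 below df.

```python
def nfi(chi2_indep, chi2_spec):
    """Normed fit index: proportionate improvement over the independence model."""
    return (chi2_indep - chi2_spec) / chi2_indep


def cfi(chi2_indep, df_indep, chi2_spec, df_spec):
    """Comparative fit index, built on noncentrality (chi-square minus df)."""
    # Noncentrality of the specified model, truncated at zero (assumed convention).
    nc_spec = max(chi2_spec - df_spec, 0.0)
    # Baseline noncentrality; never smaller than that of the specified model.
    nc_base = max(chi2_indep - df_indep, nc_spec)
    if nc_base == 0.0:
        return 1.0
    return 1.0 - nc_spec / nc_base


# Hypothetical independence-model chi-square (900.0); specified-model values from the example.
print(round(nfi(900.0, 58.06), 3))
print(round(cfi(900.0, 21, 58.06, 14), 3))
```

When the specified model's χ2 equals its df, the second function returns exactly 1.0, which mirrors the perfect-fit case described in the text.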
Discounting sampling error, a perfect-fitting model will yield a χ2 value equal to its degrees of freedom, which results in subtracting zero from the noncentrality parameter for the independence model and a value of 1.0 for CFI. Although it is still common practice to use .90 as a cutoff for acceptable fit of a model, simulations suggest that this should be an absolute lower bound. Indeed, the rigorous evaluation of fit should use .95 as the criterion (Hu & Bentler, 1999). Even when .90 is used as the general criterion, it is recommended that .95 be used for evaluation of the measurement component of the model (Mulaik & Millsap, 2000). Estimation of the example model yielded a value of .948 for CFI, just below the recommended cutoff. Another index of fit based on the noncentral χ2 distribution is the root mean square error of approximation (RMSEA; Steiger & Lind, 1980). Unlike CFI, RMSEA indexes absolute fit. Although this logic is somewhat similar to the traditional χ2 test, RMSEA differs in two important ways: (1) it is interpreted with reference to close, not perfect, fit; and (2) it includes a parsimony correction, a penalty for lack of parsimony in a model (as with adjusted R2 in multiple regression analysis). RMSEA is calculated as RMSEA =
$$\sqrt{\frac{\chi^2 / df - 1}{N - 1}}.$$
As noted earlier, when a model perfectly accounts for the data, the resulting value of χ2 will, within sampling error, equal the degrees of freedom for the model. Referring to the numerator of the formula, a perfect-fitting model would yield a numerator of zero and, consequently, a total value of zero. The focus on close, rather than perfect, fit is reflected in the general acceptance of .05 rather than zero
as the value of RMSEA indicating acceptable fit (though values from zero to .05 certainly indicate acceptable fit). The range of .05 to .08 is interpreted as close fit, .08 to .10 as marginal fit, and greater than .10 as unacceptable fit (Browne & Cudeck, 1993). Because the standard error of RMSEA is known, it is possible to put a confidence interval on the point estimate. It is now standard practice to report the confidence limits (CL) for the 90% confidence interval on the point estimate of RMSEA, focusing primarily on the upper limit with reference to the suggested cutoffs. For the example model, the point estimate of RMSEA is .079, just below the upper bound for close fit. The 90% confidence interval ranges from .058 to .100. Because the upper bound extends beyond .08, there is reason to believe the fit of the model could be improved through modifications. In the next chapter, in the section on model modification, I examine ways to improve the fit of the model. The values I have presented for χ2, CFI, and RMSEA are based on ML estimation, which assumes the joint distribution of the observed variables is multivariate normal. Examination of the univariate distributions indicates excessive negative kurtosis (less peaked than normal) in the distributions of two of the seven variables. One means of examining the degree to which this nonnormality affects the values of the omnibus fit statistics is to apply a post-estimation correction, and compare the two sets of statistics. The robust, or scaled, statistics (Satorra & Bentler, 1994) are produced by dividing the normal theory statistics (typically produced by ML) by a scaling factor that reflects the degree of departure from normality. The greater the degree of nonnormality in the data, the more favorable the scaled statistics will be relative to the normal theory statistics. In fact, this comparison offers an informal means of determining the degree to which normal estimation is appropriate. 
The greater the departure of the ratio of the scaled statistics to normal theory statistics from 1.0, the greater the effects of nonnormality on estimation. For example, the scaling correction produces a value of 44.73 for χ2 (ratio of 1.30), .961 for CFI (ratio of 1.01), and .066 (90% CL = .045, .088) for RMSEA (ratio of 1.20). The ratios indicate relatively little effect of nonnormality on CFI but nontrivial effects on χ2 and RMSEA. The adjustments are enough, however, to move values of all of the omnibus fit statistics into the acceptable range. Although there is by no means consensus on which indexes of fit should be used to judge the fit of a model, evaluations to date of the performance of the large number of extant indices suggest that CFI and RMSEA perform near theoretical expectations in samples and with data typical of social and personality research. For CFI, a cutoff value of .95 is recommended, especially for evaluations of the measurement component of models. For RMSEA, an upper bound on the 90% confidence interval of no greater than .10, and preferably no greater than .08, is recommended. Although the traditional χ2 test is not useful for judging whether a model offers an acceptable account of a set of data, I recommend reporting it. As is evident from the equations for CFI and RMSEA,
that value can be used to generate other fit indices and is used for comparisons of nested models.
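To illustrate how χ2 generates the other indices, the sketch below recomputes the RMSEA point estimate for the example model from its normal theory and scaled χ2 values (df = 14, N = 505), and then forms the ratios used to gauge the effect of nonnormality. The function name is hypothetical, and truncating a negative numerator at zero is an assumed convention for models whose χ2 falls below df.

```python
import math


def rmsea(chi2, df, n):
    """RMSEA point estimate: sqrt((chi2/df - 1) / (N - 1)), truncated at zero."""
    return math.sqrt(max(chi2 / df - 1.0, 0.0) / (n - 1))


normal_chi2, scaled_chi2 = 58.06, 44.73   # values reported for the example model
df, n = 14, 505

rmsea_normal = rmsea(normal_chi2, df, n)
rmsea_scaled = rmsea(scaled_chi2, df, n)
print(round(rmsea_normal, 3))   # 0.079
print(round(rmsea_scaled, 3))   # 0.066

# Ratios of normal theory to scaled values (larger departures from 1.0
# suggest a larger effect of nonnormality on that statistic).
print(round(normal_chi2 / scaled_chi2, 2))
print(round(rmsea_normal / rmsea_scaled, 2))
```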
Fit Relative to Alternatives
As noted in Chapter 2, the strictly confirmatory approach to SEM, though admirable on the face of it, is rarely used. The most persuasive uses of SEM are those in which interesting alternatives to the proposed model are specified and compared against it. The strongest form of this strategy involves specifying one or more models that are nested in the proposed model. A model is nested in another model if its free parameters are a subset of the free parameters in the target model (i.e., one or more parameters that were free in the target model are fixed in the nested model). Because of the simplicity of the example, relatively few interesting alternatives are possible. More complex models typically allow for specification of multiple interesting alternatives. Nonetheless, to illustrate the logic of tests of relative fit, imagine that I wished to evaluate the specified model against one in which the seven items are equivalent in their reflection of the latent variable. To begin, I could respecify the model of interest, fixing the variance of the latent variable to 1.0 rather than fixing the loading of the first variable to 1.0. (This distinction was discussed in Chapter 2.) The first loading is now a free parameter. This model would yield the same fit statistics as those reported earlier. Next, I can specify a model in which the seven loadings are constrained to be equal. This constraint, in effect, treats the seven loadings as one free parameter. As such, the modified model has six more degrees of freedom than the target model. Such constraints always result in an increase in the value of χ2, in much the same way that dropping predictors from a multiple regression equation (i.e., fixing their coefficients to zero) results in a decline in the value of R2. The question is whether the increase in χ2 is offset by the increased parsimony of the model. This question is addressed by the χ2 difference test for nested models.
The difference between the χ2 for two nested models is itself a χ2, which is evaluated based on the difference in degrees of freedom. The model with equality constraints yields a value of 147.18 for χ2. The difference between this and the χ2 for the target model (58.06), denoted as Δχ2, is 89.12. When compared against the critical value of χ2 for p = .05 and df = 6, that is 12.59, the value of Δχ2 is highly significant. This indicates that the fit deteriorated significantly when the equality constraints were imposed. The target model, in which loadings were free to vary, provides a better account of the data than this alternative. It bears mention that the difference between values of the χ2 for nested models is not χ2 distributed when, as described earlier, they have been rescaled to correct for the effects of nonnormality (Satorra & Bentler, 1994). Thus, had we elected to report scaled fit statistics for the specified and alternative model, comparing them would not have been as simple as subtracting their two values for χ2. The comparison
of nested models when the scaled χ2 are used requires a slightly more complicated calculation that includes the scaling correction for the two models (Satorra & Bentler, 2001); this calculation is illustrated with an example in Chapter 4.
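The normal theory χ2 difference test described in this section can be sketched in a few lines of Python. The example uses the values reported for the target model (χ2 = 58.06, df = 14) and the equal-loadings model (χ2 = 147.18, with six more degrees of freedom). The function names are hypothetical, and the tail probability is computed with a finite-series formula for integer degrees of freedom so that only the standard library is needed; a statistics package would normally be used instead.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a central chi-square variate with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term = total = 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: normal tail plus a finite series in odd powers of sqrt(x)
    term, total = math.sqrt(x), 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 / math.pi) * math.exp(-half) * total


def chi2_difference_test(chi2_restricted, df_restricted, chi2_target, df_target):
    """Return (delta chi-square, delta df, p) for two nested models."""
    delta_chi2 = chi2_restricted - chi2_target
    delta_df = df_restricted - df_target
    return delta_chi2, delta_df, chi2_sf(delta_chi2, delta_df)


delta, ddf, p = chi2_difference_test(147.18, 20, 58.06, 14)
print(round(delta, 2), ddf, p < 0.001)   # 89.12 6 True
```

This sketch applies only to normal theory χ2 values; as the text notes, scaled values require the adjusted calculation.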
Component Fit
Models that provide an acceptable account of a set of data can be further evaluated in terms of the parameter estimates. The evaluation of component fit considers whether the sign and magnitude of the parameter estimates correspond to predictions. The estimate of each free parameter is accompanied by a standard error. The ratio of the estimate to its standard error, the critical ratio, is evaluated with reference to the z distribution. Assuming use of the standard .05 criterion for significance, critical ratios in excess of 1.96 in absolute value indicate parameter estimates that differ significantly from zero. An alternative null value may be subtracted from the estimate in the numerator in order to test hypotheses other than difference from zero. Although researchers in social and personality psychology will likely focus their attention on tests of factor loadings and structural paths, it bears noting that a critical ratio is generated for every free parameter, allowing potentially interesting tests of other parameters such as variances of latent variables and uniquenesses.
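As a small illustration of the critical ratio, the sketch below divides an estimate (minus an optional null value) by its standard error and obtains a two-tailed p-value from the z distribution. The function name and the estimate and standard error in the usage lines are hypothetical; they are not taken from the example model.

```python
import math


def critical_ratio(estimate, std_error, null_value=0.0):
    """Return (z, two-tailed p) for a parameter estimate and its standard error."""
    z = (estimate - null_value) / std_error
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-tailed normal probability
    return z, p


# Hypothetical loading of 0.62 with a standard error of 0.08.
z, p = critical_ratio(0.62, 0.08)
print(round(z, 2), p < 0.05)

# The same estimate tested against a hypothetical null value of 0.50.
z_alt, p_alt = critical_ratio(0.62, 0.08, null_value=0.50)
print(round(z_alt, 2), p_alt < 0.05)
```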
Statistical Power
The issue of statistical power and evaluations of fit in SEM is complex. Approaches to power analysis in SEM are of two types. One focuses on the power of the χ2 test to detect misspecification of specific parameters in the model (Satorra & Saris, 1985). This approach is tedious and not suitable for determining the power of tests of omnibus fit. More easily derived alternatives based on similar logic have been proposed (e.g., Kaplan, 1990). Yet, despite their usefulness for evaluating power for tests of, or modifications to, specific parameters in a model, they are of limited use to researchers attempting to justify a proposed study based on a given sample size and a specified model. A second type of power analysis focuses on omnibus fit. A generalization of the χ2-based procedure is available (Saris & Satorra, 1993); however, it requires high-level information about the parameter estimates that is not easily produced from output generated by SEM computer programs. The most accessible strategy for evaluating the statistical power for an implementation of SEM is based on RMSEA (MacCallum, Browne, & Sugawara, 1996). As shown earlier, the computation of RMSEA is based on the value of χ2, degrees of freedom, and sample size. Because, in the numerator, χ2 is divided by df, larger values of df result in a smaller numerator and, all else being equal, a smaller value of RMSEA. N appears in the denominator; thus, for a given pair of values for χ2 and df, larger values of N yield smaller values of RMSEA, and
therefore a greater likelihood of finding close fit in tests of a model. Tables for selected degrees of freedom, sample sizes, and levels of power are provided by MacCallum et al. (1996). Alternatively, power analyses for tests of omnibus fit using RMSEA can be accomplished using a Web-based application at http://people.ku.edu/~preacher/rmsea/rmsea.htm. Referring to the running example, for our model with 14 degrees of freedom, in order to have power of .80 for the test of close fit (using .05 and .08 for the null and alternative values, respectively) using RMSEA, we would need a sample size of 585. Our sample size of N = 505 yields a power estimate of .74 for the test of close fit.
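The RMSEA-based power calculation can be sketched with the standard library alone. The sketch assumes the usual formulation of this approach, in which the noncentrality parameter under each hypothesis is (N − 1) × df × RMSEA², the critical value is the upper .05 point of the noncentral χ2 under the null value, and power is the probability of exceeding that value under the alternative. The function names are hypothetical, and the noncentral χ2 is evaluated as a Poisson mixture of central χ2 distributions, which is adequate for moderate noncentrality but is not a substitute for a tested statistical library.

```python
import math


def chi2_cdf(x, df):
    """CDF of a central chi-square variate with integer df (finite series)."""
    if x <= 0:
        return 0.0
    half = x / 2.0
    if df % 2 == 0:
        term = total = 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        upper = math.exp(-half) * total
    else:
        term, total = math.sqrt(x), 0.0
        for r in range(1, (df - 1) // 2 + 1):
            if r > 1:
                term *= x / (2 * r - 1)
            total += term
        upper = math.erfc(math.sqrt(half)) + math.sqrt(2.0 / math.pi) * math.exp(-half) * total
    return 1.0 - upper


def ncx2_cdf(x, df, lam):
    """Noncentral chi-square CDF as a Poisson(lam/2) mixture of central CDFs."""
    half = lam / 2.0
    weight = math.exp(-half)          # Poisson probability of j = 0
    total, j = 0.0, 0
    while j <= half or weight > 1e-16:
        total += weight * chi2_cdf(x, df + 2 * j)
        j += 1
        weight *= half / j            # Poisson probability of the next j
    return total


def rmsea_power(df, n, rmsea0=0.05, rmsea_a=0.08, alpha=0.05):
    """Power of the RMSEA test of close fit (null rmsea0 against alternative rmsea_a)."""
    lam0 = (n - 1) * df * rmsea0 ** 2
    lam_a = (n - 1) * df * rmsea_a ** 2
    # Critical value: upper-alpha point under the null, found by bisection.
    lo, hi = 0.0, df + lam0 + 20.0 * math.sqrt(2.0 * (df + 2.0 * lam0))
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if ncx2_cdf(mid, df, lam0) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    crit = (lo + hi) / 2.0
    return 1.0 - ncx2_cdf(crit, df, lam_a)


print(round(rmsea_power(14, 505), 2))   # compare with the .74 reported in the text
print(round(rmsea_power(14, 585), 2))   # compare with the target power of .80
```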
A Note on Missing Data
In general, concerns raised by missing data are no different for SEM than for other statistical methods. Given the relative complexity of models in SEM, however, methods for effectively managing missingness that work well for simpler methods such as ANOVA and multiple regression analysis (e.g., multiple imputation) are not as easily applied to SEM. As with any method, listwise and pairwise deletion of cases is not acceptable except in the unlikely and unverifiable instance of a missing-completely-at-random pattern. Pairwise deletion is particularly problematic for implementations that use the covariance matrix as input, because it results in covariances based on different subsamples and, on occasion, implausible and statistically inadmissible patterns of covariances (Arbuckle, 1996). A subset of computer programs for SEM offers full information maximum likelihood (FIML), an estimator-based approach to managing missing data. Graham, Olchowski, and Gilreath (2007) have shown that FIML is equivalent to generating an infinite number of imputations using multiple imputation methods. Assuming it is offered by the computer program used to estimate a model, FIML is easily implemented and produces good results when the pattern of missingness is missing at random or when the pattern is missing not at random but auxiliary variables are included in the model (Collins, Schafer, & Kam, 2001).
Computer Programs
Gone are the days when the use of SEM required access to and knowledge of a single computer program, LISREL, and knowledge of matrix notation in order to specify models. Although LISREL is now one in a growing array of computer programs for estimating models, its singular impact on the spread of interest in SEM cannot be overstated. In fact, so closely associated were SEM and the LISREL software that SEM was (and still is in some quarters) referred to as the LISREL model or, simply, LISREL. As recounted by Sörbom (2001), the original LISREL software, which was introduced in 1970, was capable of modeling
only single-indicator latent variables (i.e., uniqueness fixed at zero and loading fixed at 1.0). LISREL II, released in 1971, allowed the user to model latent variables with multiple indicators. Sörbom joined Jöreskog to produce LISREL III (1974), which added the capacity to model means. LISREL IV (1978), V (1981), and VI (1984) added additional estimators to the standard ML as well as greater ease of use and more detailed output. LISREL 7 (1988) was accompanied by PRELIS, which allowed for data manipulation and basic analyses prior to SEM analyses. The current version of LISREL, LISREL 8 (1993), added SIMPLIS, a simplified syntax system by which models are specified in equation form. The history of the LISREL computer program reflects the evolution of SEM from an arcane, fairly limited statistical approach accessible only to researchers with training in mathematical statistics, to a wide-ranging statistical approach accessible to researchers with standard training in inferential statistics. Listed below are the computer programs currently available for implementing SEM. The programs are grouped according to whether they are commercial or available free of charge and, within the set of commercial programs, whether they are primarily for SEM or offered in more general data analysis programs. For three reasons, I do not provide detailed descriptions, samples of syntax, and the like. First, the number of programs and their capabilities have increased to a point that meaningful coverage of the full array of programs would require more pages than can be given to the topic in this book. Second, many of the computer programs are regularly updated, often to such an extent that detailed descriptions would, in some cases, be out of date before this book goes to print. Finally, it is our good fortune that detailed descriptions of most of the programs are provided in books, journal articles, and websites. 
The latter often offer access to no-cost demonstration versions of commercial programs, allowing prospective users to “test-drive” before deciding whether to purchase. Addresses for accessing such resources and citations to relevant books and articles are provided.
Commercial Stand-Alone Programs:
• AMOS (Blunch, 2008; Byrne, 2010)
  www.spss.com/amos/
• EQS (Byrne, 2006; Mueller, 1999)
  www.mvsoft.com/
• LISREL (Byrne, 1998; Kelloway, 1998; Mueller, 1999)
  www.ssicentral.com/lisrel/
• Mplus (Byrne, 2011)
  www.statmodel.com/
• STREAMS (used for data preprocessing, specification, and output; requires AMOS, EQS, LISREL, or Mplus for estimation)
  www.mwstreams.com/
Commercial Data Analysis Programs with SEM Capability:
• SAS TCALIS
  http://support.sas.com/documentation/cdl/en/statugtcalis/61840/PDF/default/statugtcalis.pdf
• Stata GLLAMM (Rabe-Hesketh, Skrondal, & Pickles, 2004)
  www.gllamm.org/
• Statistica SEPATH
  www.statsoft.com/products/statistica-advanced-linear-non-linear-models/itemid/5/#structural
• Systat RAMONA
  www.systat.com/
No-Cost Stand-Alone Programs:
• AFNI 1dSEM package
  http://afni.nimh.nih.gov/sscc/gangc/PathAna.html
• OpenMx
  http://openmx.psyc.virginia.edu/
• R sem package (Fox, 2006)
  http://cran.r-project.org/web/packages/sem/index.html
• SmartPLS (estimation by the method of partial least squares only)
  www.smartpls.de/forum/
It bears noting that formal training in SEM for students of social and personality psychology has not kept pace with the accessibility and capability of SEM computer programs (Aiken et al., 2008). As such, social and personality researchers are now in a position of being able to specify and estimate models they do not fully understand (Steiger, 2001). Readers new to SEM are encouraged to acquire at least foundational knowledge of SEM before choosing a computer program to implement it. A concerning pattern evident in the software-oriented workshops that have become fashionable is the failure to distinguish learning to use an SEM computer program from learning to use SEM. A reasonable compromise for prospective users not in a position to acquire formal training in SEM is the use of resources that combine the presentation of SEM with the presentation of a computer program for implementing it (e.g., Byrne, 1998, 2006, 2010, 2011).
4
Modification, Presentation, and Interpretation
In rare instances, the evaluation of fit indicates that a specified model provides an acceptable account of the data. More typically, the evaluation of fit indicates that the model falls short as a potential explanation of the data. In such cases, the researcher has two options: follow the strictly confirmatory approach and consider the model rejected and the analysis complete, or follow the model generating approach and consider ways in which the model might be modified to produce a plausible model that yields acceptable values of fit statistics. If either the original model or a modified version of it produces fit statistics that could be used to argue in support of the model, the implementation of SEM turns to presentation and interpretation of the results. In this chapter, I treat each of these activities in turn. In the first part of the chapter, I discuss the activity of model modification with a particular emphasis on the potential for inferential errors. I then offer practical suggestions for effectively presenting the results of an SEM analysis using an example. I conclude the chapter, and the portion of the book on steps in the implementation of SEM, with a discussion of the interpretation of results.
Model Modification
As with any statistical method, the move from a priori to post hoc hypothesis testing increases the likelihood of Type I error – capitalization on chance. In the case of SEM, this takes the form of accepting a model as offering an acceptable account of the data in the population when, in reality, it includes features that apply only to data from the sample on which estimates are based. In such cases, the likelihood of replicating support for those features of the model using data from another sample from the same population is relatively low. Assuming that the observed data are representative of (e.g., drawn randomly from) the population, the likelihood of Type I errors (i.e., modifications that do not generalize to the population) increases as sample size decreases. In terms of model modification in SEM, the question for researchers in social and personality psychology is: How large must a representative sample be in order to ensure the detection of modifications that reflect the population model rather than a
model optimized for the sample of data at hand? This question was addressed empirically in a simulation study by MacCallum et al. (1992). These authors began by estimating a model using data from a large sample (N = 3694). Values of omnibus fit indices indicated promising but not acceptable fit of the data to the model (e.g., value of .875 for a normed fit index). Automated specification searching (described below) using data from the full sample indicated that model fit could be improved by freeing four parameters fixed at zero in the originally specified model (resulting in a value of .930 for the fit index). The authors reasoned that, if the observed data are considered population data and the modified model an acceptable approximation of the true model in the population, then samples of varying sizes could be drawn from the full sample (i.e., population) and used to determine the likelihood of finding the correct modifications to produce the true model with different sample sizes when specification begins with the original model. MacCallum et al. (1992) drew 10 random samples each of sizes 100, 150, 200, 250, 325, 400, 800, and 1200, and used them to estimate parameters in the original model and evaluate the degree of correspondence between the first four modifications suggested by an automated specification search and the four known to apply in the population. Focusing first on omnibus fit after the four modifications identified by specification searching were applied, for samples of 100 and 150, the average value of the fit index did not reach a liberal threshold of model fit (normed value of .90). Even samples of size 325 produced a range of values that, at the lower end, fell well below the threshold of model fit. Only N of 800 and 1200 consistently produced fit statistics that would lead to acceptance of the model. An evaluation of the modifications recommended by automated searching reveals the basis for these results.
Of the 40 replications with N of 250 or less, none of the searches yielded the four modifications known to apply in the population. The likelihood of finding the true modifications by automated specification searching was only 1 in 10 for samples of 325, toward the high end of the range of sample sizes typical of social psychological research and about the average in personality research. The likelihood of finding the true modifications increased with increasing sample size, peaking at 6 in 10 for N = 1200. These findings are a striking illustration of the troublingly high likelihood of producing a modified model that offers acceptable fit to sample data but does not approximate the true model in the population. If sample size is large or replication is a possibility, disciplined model modification is a worthwhile endeavor that can yield new discoveries. There are two primary approaches to specification searching, the activity of finding the changes to the specification of a model that would improve its fit to the observed data. Manual searching relies on careful inspection and interpretation of output from the analyses. Automated searching relies on the computer program used to estimate the model to consider potential modifications and report the outcome if they were included in a respecified model. In the remainder of this section I review
these two approaches to deciding how an originally specified model that does not provide an acceptable account of the observed data can be modified until it meets the chosen fit criteria.
Manual Specification Searching
One approach to model modification is manual specification searching, which primarily involves examination of the residual matrix for clues as to misspecification in the original model. Misspecification is evidenced by large values of residual covariances relative to their counterparts in the observed covariance matrix. Residual covariances may be positive or negative. Because residual covariances are obtained by subtracting the implied covariances from the observed covariances, positive values indicate covariances not adequately explained by the specified model. Negative values indicate an over-accounting of the observed covariances by the model. It is important to keep in mind that, unlike correlation coefficients, covariances have no readily interpretable metric apart from the rather complex consideration of the metrics of the variables whose covariance they index. As such, when the focus is residual covariances, the recommended strategy is a comparison of the implied and observed covariances accounting for any difference in sign. For example, if the observed covariance is 112 and the implied covariance is 56, then the residual covariance of 56 indicates that half of the observed covariance is not explained by the model. Alternatively, if the observed covariance is 908 and the implied covariance is 852, the residual covariance of 56 indicates that less than 15% of the observed covariance is not accounted for by the model. As noted in Chapter 3, some SEM computer programs provide a standardized residual matrix, in which elements off the diagonal can be interpreted as correlation coefficients. Although the values of the residuals can now be interpreted with reference to a well-known metric, it is nonetheless beneficial to evaluate these values with reference to the observed data. Doing so requires generation of the zero-order correlation matrix for the observed data.
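The comparison of residual and observed covariances in the two numerical examples above amounts to a simple proportion, sketched here in Python. The function name is hypothetical, and the sketch assumes a nonzero observed covariance.

```python
def unexplained_share(observed_cov, implied_cov):
    """Return the residual covariance and its share of the observed covariance."""
    residual = observed_cov - implied_cov
    return residual, residual / observed_cov


for observed, implied in [(112, 56), (908, 852)]:
    residual, share = unexplained_share(observed, implied)
    print(observed, implied, residual, round(share, 3))
```

The same residual of 56 corresponds to a share of .5 in the first case and roughly .06 in the second, which is why the residual is read against its observed counterpart rather than in isolation.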
I illustrate manual specification searching by returning to the one-factor model of the internal-claimed dimension of the propensity to self-handicap used to illustrate estimation and fit in Chapter 3. Recall that, although the recommended fit statistics and criteria suggested that the model provided an acceptable fit to the data, χ2(14, N = 505) = 58.06, p < .001; CFI = .948; RMSEA = .079 (90% CL = .058, .100), I noted that the residual for the first two indicators was large relative to its value in the observed covariance matrix (residual covariance = .251, observed covariance = .506). When standardized, the correlation between these indicators was .18 after accounting for the latent variable. Thus, the latent variable accounted for only about half of the correlation between these indicators (zero-order r = .37). Put differently, these two indicators appear to share something
in common that they do not share with the remaining indicators. Indeed, an examination of the items indicates that these two involve claiming interference by emotions as a means of self-handicapping (“… let my emotions get in the way,” “… let emotional problems in one part of my life interfere with other things in my life”), whereas the other five items do not (e.g., “… get minor aches and pains before a big evaluation”). The most straightforward means of accounting for this unexplained covariance is to free the covariance between the uniquenesses of the first two indicators. The addition of this free parameter to the originally specified model is shown in Figure 4.1. The originally specified model is nested in the respecified model because its parameters are a subset of those in the respecified model. As such, the modification can be tested using the χ2 difference test. Because the respecified model has one more free parameter than the original model, the χ2 difference is tested against the one degree of freedom reference distribution. The value of χ2 for the respecified model is 28.64, which yields a value of Δ χ2 = 29.42. This value is highly significant, indicating that the restriction on the covariance in the original model is not warranted. The covariance between the first two indicators is now fully accounted for, in part by the latent variable, and in part by the covariance between their uniquenesses. This is evident in the residual matrix, within which the covariance for the first two indicators is now zero. Recall from Chapter 3 that I found some evidence of nonnormality in the data, as evidenced by excessive kurtosis in two of the variables and a more favorable value of the fit statistics after they were corrected for nonnormality. As such,
Figure 4.1 One-factor model with a pair of correlated uniquenesses [path diagram: latent variable F1 with indicators v1 to v7 and uniquenesses e1 to e7]
I would report corrected, or scaled, statistics in a presentation of the findings. I also established that, when using scaled fit statistics, the comparison of nested models involves more than simply subtracting the χ2 value for the less restricted model (i.e., the one with fewer degrees of freedom) from the χ2 value for the other model. When using scaled statistics, the standard χ2 difference is adjusted to account for the scaling corrections to the two χ2 (Satorra & Bentler, 2001). The formula is
$$\text{scaled } \Delta\chi^2 = \frac{\Delta\chi^2}{\left( df_0 \, \dfrac{\chi^2_0}{\text{scaled } \chi^2_0} - df_1 \, \dfrac{\chi^2_1}{\text{scaled } \chi^2_1} \right) \Big/ \Delta df},$$
where Δχ2 and Δdf are the standard, normal theory differences, and subscripts 0 and 1 denote the less and more restricted models, respectively. (A computer program for computing this value can be downloaded without cost from www.abdn.ac.uk/~psy086/dept/sbdiff.htm.) Note that, when the data are multivariate normal (i.e., the third and fourth moments are zero), the ratios of the normal and scaled χ2 are 1.0, leaving a value of 1.0 (Δdf/Δdf) for the denominator of the equation and an overall value equal to the difference between the normal theory χ2. Comparing the scaled χ2 for the original and respecified models using this formula yields a value of scaled Δχ2 = 17.49, which, on one degree of freedom, is highly significant and, therefore, statistical evidence in favor of the respecified model.
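The scaled difference formula can be sketched as a short Python function. The function and argument names are hypothetical; the two models are labeled "restricted" and "free" here, and the result does not depend on which is labeled 0 or 1 as long as Δdf is taken in the same order as the df terms in the denominator. The normal theory values and the scaled χ2 for the original model come from the example, but the scaled χ2 for the respecified model (22.0) is an invented placeholder, because that value is not reported in this section; the printed result is therefore illustrative and will not reproduce the 17.49 given in the text.

```python
def scaled_chi2_difference(chi2_r, scaled_r, df_r, chi2_f, scaled_f, df_f):
    """Scaled chi-square difference for a restricted (r) and a freer (f) nested model."""
    c_r = chi2_r / scaled_r          # scaling correction, restricted model
    c_f = chi2_f / scaled_f          # scaling correction, freer model
    delta_df = df_r - df_f
    # Scaling correction for the difference test (denominator of the formula).
    scale = (df_r * c_r - df_f * c_f) / delta_df
    return (chi2_r - chi2_f) / scale


# Original model: chi2 = 58.06, scaled chi2 = 44.73, df = 14 (from the example).
# Respecified model: chi2 = 28.64, df = 13; scaled chi2 = 22.0 is HYPOTHETICAL.
print(round(scaled_chi2_difference(58.06, 44.73, 14, 28.64, 22.0, 13), 2))

# With no scaling (scaled equals normal theory), the ordinary difference is recovered.
print(round(scaled_chi2_difference(58.06, 58.06, 14, 28.64, 28.64, 13), 2))   # 29.42
```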
Automated Specification Searching
The translation of relatively large residual covariances to specific modifications of a model requires a deeper understanding of SEM than many otherwise skilled researchers bring to the analysis. Furthermore, for complex models with many variables and, therefore, many alternative specifications, even skilled and experienced users of SEM may find manual specification searching impractical. In such cases, an alternative strategy to model modification is automated specification searching. The logic of automated specification searching can be illustrated using an example with which many researchers in social and personality psychology would be familiar – the use of coefficient alpha to evaluate candidate items for a measure. Reliability analyses provided by standard computer programs provide not only the value of alpha for the full set of items, but also the value of alpha if each item were individually dropped from the set. For example, the value of alpha for the full set might be .69, but the value of alpha would be .80 if the third candidate item were dropped. In the same way that this information might be used to produce a scale with acceptable psychometric properties, researchers might use information provided by an automated specification search to produce an acceptable model.
The equivalent of the alpha-if-item-deleted value is the modification index (Sörbom, 1989; equivalent to the Lagrange multiplier test, Bentler, 1986). The modification index, in effect, treats each fixed parameter in a model as candidate items are treated in evaluations of the internal consistency of a new measure, asking: How would the omnibus fit of the model change if the parameter were freed? The value of the modification index is the equivalent of the change in χ2 between a model in which the parameter is fixed (the original model) and one in which it is free (the model that would result were it freed). Following that logic, any value larger than 3.84, the critical value of χ2 on one degree of freedom, indicates a significant improvement in omnibus fit if the parameter is freed. Some computer programs provide modification indices for all fixed parameters, whereas others provide only those that exceed some specified criterion (e.g., p < .05). As with the use of alpha-if-item-deleted values to finalize a set of items for a new measure, modification indices indicate the change in fit that would result from freeing previously fixed parameters one at a time. Because, like candidate items for a new measure, parameters in a model are correlated, changing the status of one parameter will change the estimated values of other parameters in the model. As such, when one parameter is freed and the model re-estimated, values of modification indices for the remaining fixed parameters will change. One implication of this property of modification indices is that one should not free more than one parameter at a time based on a single set of modification indices. Doing so risks producing a modified model in which some parameter estimates are nonsignificant because the covariance they would have accounted for if freed is accounted for by parameters freed before them.
Another implication is that the order in which individual parameters with significant modification indices are freed may influence the specification of the final model. Both of these concerns are addressed by a multivariate approach to automated specification searching using the modification index (Bentler & Chou, 1986). In the most straightforward form of this approach, the computer program evaluates change in model fit in a sequential manner, beginning with the fixed parameter that would yield the greatest improvement in omnibus fit, then, with that parameter freed, moving to the parameter with the next largest modification index. This sequential process continues until all fixed parameters in the original model that are subject to modification have been freed. This multivariate approach offers two advantages over the univariate approach: (1) the researcher can readily see the effect of freeing parameters with large modification indices on those with smaller modification indices; and (2) because the expected change in χ2 is incremented at each step in the sequence, the researcher can easily determine the point at which adding additional modifications to a set would yield no significant improvement in omnibus fit. Because, in many cases, a residual covariance can be reduced in multiple ways, a key consideration in automated specification searching is the determination of which parameters will be included in the search. For instance, imagine a two-factor measurement model, with v1 to v5 loading on F1 and v6 to v10 loading on F2.
Estimation yields fit statistics that do not meet fit criteria on the indices of choice. Now assume that the residual covariance between v1 and v6 is relatively large. As shown earlier, one modification that would reduce the residual covariance, thereby improving fit, is to free the covariance between e1 and e6. Imagine, however, that a significant portion of the unexplained covariance between v1 and v6 is that portion of variance in v1 explained by F1. Also note that, in the original specification, not only were the covariances between uniquenesses fixed at zero, but the loading of each variable on the other factor was fixed at zero as well. As such, an alternative modification would be to free the loading of v6 on F1, thereby accounting for that portion of covariance between v1 and v6 by F1. An automated specification search that did not include cross-loadings would point to freeing the covariance between the two uniquenesses, whereas a search that did not include covariances between uniquenesses likely would point to freeing the cross-loading. Another example of this consideration highlights a creative and sometimes revealing modification when the specific component of uniquenesses associated with indicators of latent variables is of potential interest. Imagine an endogenous latent variable whose indicators are symptoms (e.g., depression) or different manifestations of a general problem (e.g., problem behavior). Presumably, these indicators are influenced to a significant degree by the latent variable; however, it is possible, even likely, that a significant and interesting proportion of their variance is not shared with other indicators of the latent variable. For instance, problems with sleep may be, in part, attributable to depression, but they may arise for other reasons (e.g., stress, physical pain).
Potential reasons can be considered empirically by allowing the uniqueness (itself a latent variable) to correlate with other variables in the model (see Newcomb, 1994, for an example). Such paths are termed specific effects because they involve the specific (i.e., nonrandom) component of what might otherwise be termed measurement error in the indicator. For indicators that are, in effect, replicates of each other, specific effects make little sense. However, for high-bandwidth constructs, whose indicators are likely to include significant specificity (Briggs & Cheek, 1986), specific effects may be of considerable substantive interest. Importantly, however, such effects are rarely specified a priori and, unless the computer program used for automated specification searching is capable of considering these effects and instructed to do so, they will not be considered as potential modifications to the original model. To this point I have highlighted only one alternative in model modification: freeing parameters that are fixed in the originally specified model. A second, less consequential, alternative is to fix parameters that, in the original model, are free. This strategy virtually always involves fixing the values at zero, the null value against which estimates of free parameters are tested by the critical ratio. As such, a search for such modifications that would improve model fit can be as simple as examining the statistical tests of individual parameter estimates. An alternative is an automated strategy based on the Wald test (Bentler, 1986). An advantage of this approach is that it provides the increment in χ2 that fixing a previously free
parameter would yield. Because fixing a previously free parameter adds a restriction to a model, the value of χ2 will increase. The goal of modification by this strategy is a nonsignificant increase in χ2 (i.e., less than 3.84). A second advantage of this approach is that it can be done at the multivariate level, considering the full set of free parameters simultaneously or incrementally (Bentler & Chou, 1986). Fixing previously free parameters rarely results in dramatic improvements in fit of a model; however, by reducing the number of free parameters, thereby increasing the degrees of freedom, it can yield more favorable values of fit statistics that impose a penalty for lack of parsimony such as RMSEA.
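The 3.84 criterion that appears in the discussion of both the modification index and the Wald test is simply the .05 critical value of χ2 on one degree of freedom. A minimal sketch, using only the Python standard library and hypothetical function names, shows how a single-parameter change in χ2 can be converted to a p-value and compared with that criterion.

```python
import math


def chi2_p_value_1df(delta_chi2):
    """Upper-tail p-value of a chi-square statistic on one degree of freedom."""
    # A 1-df chi-square is a squared standard normal variable, so
    # P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(delta_chi2 / 2.0))


def is_significant_change(delta_chi2, alpha=0.05):
    """True if a one-parameter change in chi-square is significant at alpha."""
    return chi2_p_value_1df(delta_chi2) < alpha


# A change of 3.84 sits almost exactly at p = .05.
print(round(chi2_p_value_1df(3.84), 3))
# Hypothetical modification indices for three fixed parameters.
for index in (2.10, 3.90, 12.50):
    print(index, is_significant_change(index))
```

For a modification index, a significant value argues for freeing the parameter; for the Wald test, the goal is the reverse, a nonsignificant increase when the parameter is fixed.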
Presentation
An implementation of SEM typically proceeds to the presentation stage when evidence in support of a model has been obtained from either the original specification (including comparisons with alternative models) or modifications of it. Typically, the presentation is in the context of a manuscript to be submitted to a journal for publication, which means that not all of the wealth of information that might be presented can be included. As a result, the researcher reporting on an implementation of SEM and its results must balance the typical journal requirement to exercise restraint in reporting statistical results with the need to provide all the information necessary for researchers to understand what was done and why. In this section, I use an example to highlight important information to be presented in reports of results from SEM analyses and suggest efficient and effective means of presenting the information.
Model Description
The use of SEM begins with specification of a model. Specification typically is grounded in theory; prior tests of the model or similar models; or relevant published findings. For this reason, the model typically is described at two points in the presentation of an SEM analysis: (1) in the introduction, where the focus is on constructs and theoretical relations; (2) in the description of the plan of analysis, where the focus is on the translation of the conceptual model to a specified model. Although a path diagram may or may not prove useful in the presentation of the conceptual model, it can be highly effective in the presentation of the model to be estimated. It is important in the presentation of the model to be estimated that an accurate and complete accounting for fixed and free parameters in the model is provided. A byproduct of this exercise is a statement of the number of degrees of freedom on which tests of fit will be based. To illustrate this aspect of presentation and the remaining material covered in this section, I refer to an example representative of research in social psychology. The example concerns the effect of individual differences in the contingency of
self-esteem on appearance on state self-awareness (Duvall, Hoyle, & Robinson, 2010). My collaborators and I reasoned that individuals high in contingency on appearance would be particularly sensitive to the salience of appearance-related cues in the environment. This should be evidenced by a temporary increase in self-awareness when confronted with such cues. In order to generate data relevant to these hypotheses, we designed an experiment involving college-age women who had completed a measure of contingency on appearance (Crocker, Luhtanen, Cooper, & Bouvrette, 2003) several weeks earlier. Women from the upper and lower thirds of the distribution on contingency were randomized to one of two levels of appearance salience. In the high-salience condition, they learned immediately on arrival at the site of the experiment that a photograph would be taken for use later in the study. In the low-salience (i.e., naturally occurring) condition, no mention was made of the need for a photograph. Participants were then given a set of questionnaires to complete. Near the end of the set was a brief, ad hoc questionnaire asking them to indicate how much, if at all, they had thought about various things during the session. Embedded in the list was “how you look,” which served as our measure of appearance salience. The final questionnaire was a state version of the public subscale of the Self-Consciousness Scale (Fenigstein, Scheier, & Buss, 1975). Our conceptual reasoning suggested a model in which individual differences in the contingency of self-esteem on appearance interact with the presence or absence of an appearance cue to influence the salience of appearance, which, in turn, influences awareness of the public self. Although I might describe this model in conceptual terms with reference to a path diagram, the model specification required to properly estimate the model raises concerns not directly related to the conceptual hypothesis.
As such, I would rely on a written description to communicate the model in conceptual terms and use a path diagram to communicate the model to be estimated using SEM. A path diagram of the model to be estimated is presented in Figure 4.2. First, note that, unlike structural models presented to this point in the book, all but one of the variables in the structural model are observed. Although contingency on appearance, given that it was measured using five items, might be modeled as a latent variable, in this instance, participants were characterized as high or low based on a total score from an earlier data collection effort. Condition is the salience manipulation, and condition within low contingency and condition within high contingency are orthogonal contrasts that model the two-way interaction in a way that provides a direct test of the expected effect of the manipulation for high-contingency individuals. In the model, the observed score on situated focus on appearance (i.e., salience) is positioned as a mediator. As modeled, the commonality among the seven public self-consciousness items can be attributed to two latent variables. Not surprisingly, the public self-consciousness latent variable influences all the indicators. Note, however, that three of the items are influenced by a second latent variable, a subfactor (Rindskopf & Rose, 1988) that
[Figure 4.2: path diagram. Variables shown: Contingency on Appearance; Condition within Low Contingency; Condition within High Contingency; Situated Focus on Appearance; Appearance Subfactor; Public Self-Consciousness.]
Figure 4.2 Path diagram of mediation model predicting state public self-consciousness from individual differences in contingency of self-esteem on appearance and manipulated salience of appearance (Paths designated by broken lines were fixed at zero. Disturbances and uniquenesses are omitted to improve readability of the figure. Parameters to be estimated are presented in the text)
reflects self-consciousness related specifically to appearance. This latent variable was specified a priori as a means of extracting appearance-based public self-consciousness from public self-consciousness more generally, the latter being the focus of the hypothesis. In terms of structural paths, the model posits that individual differences in the contingency of self-esteem on appearance influence situated focus on appearance and both forms of state self-consciousness; for state self-consciousness the effect is direct and indirect. The model further posits a direct effect of the manipulation on situated focus on appearance for individuals high in contingency but not for their low-contingency counterparts. It posits no direct effect of the experimental condition on state public self-consciousness. Instead, it specifies an indirect effect through situated focus on appearance for high-contingency individuals. Note that several components included in earlier path diagrams have been omitted. These include disturbances, uniquenesses, and * to indicate free parameters. These are not required to effectively communicate the primary hypotheses tested by the model, and their inclusion would further complicate an already busy figure. The fact that these aspects of the diagram have been omitted and the rationale for doing so are indicated in the figure caption. Having provided the reader with a detailed description of the model to be estimated, including the position of all observed and latent variables in the model, I can now indicate which parameters are to be estimated and, as a result, the degrees of freedom on which tests of fit will be based. The model includes the following free parameters:
• three variances of the exogenous variables and three covariances between them
• three disturbances of the endogenous variables
• six loadings on the primary latent variable and two loadings on the subfactor
• seven uniquenesses
• six path coefficients
This tallies to 30 free parameters. The observed covariance matrix produced from the p = 11 observed variables comprises p(p + 1)/2, or 66, unique elements. Subtracting the number of free parameters from the number of observed data points yields 36 degrees of freedom.
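The bookkeeping behind these numbers is easy to verify. The following Python sketch simply restates the tally above; the dictionary labels are my own shorthand for the parameter groups in the list.

```python
# Free parameters in the model, grouped as in the list above.
free_parameters = {
    "variances of exogenous variables": 3,
    "covariances among exogenous variables": 3,
    "disturbances of endogenous variables": 3,
    "loadings on the primary latent variable": 6,
    "loadings on the subfactor": 2,
    "uniquenesses": 7,
    "path coefficients": 6,
}


def unique_elements(p):
    """Number of unique variances and covariances among p observed variables."""
    return p * (p + 1) // 2


n_free = sum(free_parameters.values())
n_data_points = unique_elements(11)
degrees_of_freedom = n_data_points - n_free
print(n_free, n_data_points, degrees_of_freedom)  # 30 66 36
```

Each parameter freed in a later modification reduces the degrees of freedom by one.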
Details of Analysis
Because of the many ways in which a model might be respecified to produce interesting alternative models, it is customary to provide the covariance matrix in the published report. This information coupled with an accurate and sufficiently detailed description of the model would allow interested readers to duplicate the results and estimate alternative models that were not considered or described in the report. An economical means of providing the observed data as well as useful descriptive statistics is to include the correlation matrix along with standard deviations. Most SEM computer programs will generate covariances if given correlation coefficients and standard deviations, or it can be done manually as COV_xy = r_xy(s_x s_y).
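The manual conversion from correlations and standard deviations to covariances can be sketched as follows. The function name is hypothetical; the values are the correlations and standard deviations reported in Table 4.1 for the first three public self-consciousness items, used here only to illustrate the arithmetic.

```python
def correlations_to_covariances(correlations, sds):
    """Rescale a correlation matrix to covariances: cov_xy = r_xy * s_x * s_y."""
    k = len(sds)
    return [[correlations[i][j] * sds[i] * sds[j] for j in range(k)]
            for i in range(k)]


# Public self-consciousness items 1 to 3 (variables 5 to 7 in Table 4.1).
sds = [1.09, 1.09, 1.11]
correlations = [
    [1.00, 0.58, 0.28],
    [0.58, 1.00, 0.44],
    [0.28, 0.44, 1.00],
]

covariances = correlations_to_covariances(correlations, sds)
for row in covariances:
    print(["%.3f" % value for value in row])
```

The diagonal of the result holds the variances (the squared standard deviations). Because published correlations are rounded, covariances recovered this way will differ slightly from those computed from case-level data.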
The correlation coefficients and standard deviations for the 11 observed variables are given in Table 4.1. I have also included the means, which make evident that the exogenous variables were centered. If normality were a concern, I might also have included columns for skewness and kurtosis, though the space needed for such details is rarely available in journals not focused on methodology or statistics. A recommended alternative is a few sentences summarizing evaluations of distributions and any measures (e.g., transformations, parceling, alternative estimators) taken to address serious departures from normality. For the current example, I consulted the normalized estimate of Mardia’s coefficient, which indexes multivariate kurtosis. The value of −0.24 is near zero, indicating no evidence of nonnormality in terms of kurtosis. Univariate tests of skewness indicated no significant departure from symmetry in the distributions of the individual variables. Given these data, use of ML estimation is warranted. It bears noting that I am able to provide this information about the state of the distributions because I have access to case-level data. This is not the case if I estimate from a covariance matrix such as what I might find in a published report of SEM analyses. In such cases, the assumption of multivariate normality goes untested.
Table 4.1 Correlation Coefficients and Standard Deviations for Variables Included in the Model

Variable                                    M     SD   1       2       3       4       5       6       7       8       9       10
 1. Contingency on appearance              .00  1.00
 2. Condition within low contingency      -.00   .69   .05
 3. Condition within high contingency      .00   .72   .02     .00
 4. Situated focus on appearance          2.20  1.20   .18*    .17*    .28***
 5. Public self-consciousness item 1      3.20  1.09   .20**  -.09     .04     .04
 6. Public self-consciousness item 2      3.63  1.09   .24**   .00     .12     .21**   .58***
 7. Public self-consciousness item 3      3.61  1.11   .51***  .08     .11     .24**   .28***  .44***
 8. Public self-consciousness item 4      3.66  1.06   .26***  .04     .11     .13     .30***  .51***  .52***
 9. Public self-consciousness item 5      3.53  1.27   .43***  .04     .09     .24**   .29***  .30***  .39***  .39***
10. Public self-consciousness item 6      3.42  1.09   .37***  .07     .14     .25***  .34***  .49***  .56***  .63***  .36***
11. Public self-consciousness item 7      3.72   .92   .33***  .05     .15     .22**   .31***  .46***  .52***  .42***  .51***  .51***

Note: N = 161. *p < .05, **p < .01, ***p < .001
In addition to details about the data, a report on SEM analyses should include a clear statement about how model fit will be evaluated complete with justification and criteria. In Chapter 3, I made a case for the comparative fit index (CFI; Bentler, 1990) and the root mean square error of approximation (RMSEA; Steiger & Lind, 1980), which were used to evaluate the fit of the current model. In addition, I recommend reporting but not interpreting the value of the traditional χ2 test. CFI is normed and indexes fit with reference to an independence model. It performs well with relatively small samples such as the current one (N = 161). As recommended by Hu and Bentler (1999), I declare .95 as a minimum value for inferring model fit. I will interpret values between .90 and .95 as indicative of marginal fit, high enough to consider minor modifications in order to achieve acceptable fit. RMSEA indexes fit in absolute terms and includes a penalty for lack of parsimony. Following the recommendations of Browne and Cudeck (1993), I declare values from zero to .08 as indicative of acceptable (i.e., close) fit, and values from .08 to .10 as evidence of marginal fit. These criteria apply to point estimates of RMSEA. I recommend reporting the 90% confidence interval for RMSEA as well. In the current example, given the relatively small N, the confidence interval is likely to be fairly wide, perhaps extending into the marginal or poor-fit range even with a point estimate of .08 or less. In the event of marginal fit, I will examine residuals and free only those parameters that are plausible.
Information Relevant to Omnibus Fit
I estimated the model shown in Figure 4.2 using the ML method. Although all parameters in the model are identified, the value of the fitting function had not reached its minimum after 30 iterations (the default maximum), resulting in a failure to converge. I overrode the default limit on number of iterations, raising it to 100, and re-estimated the model. This strategy produced convergence at 38 iterations. According to the chosen fit criteria, the fit of the model was marginal, χ2(36, N = 161) = 74.39, p < .001; CFI = .92; RMSEA = .08 (90% CL = .06, .11). Values in the standardized residual matrix were uniformly small with one exception: the residual correlation between the first two public self-consciousness items. Referring back to Table 4.1, the observed correlation between these two items was r = .58. The standardized residual of .23 indicates that 40% (r_residual/r_obs) of the correlation between the two items is not explained by the model. Examination of the items indicates that both begin with “I’m concerned about” and refer to self-presentation, a combination of properties they do not share with other items on the scale. In order to account for this unexplained correlation, I modified the model by freeing the covariance between the uniquenesses for the first two public self-consciousness items. Estimation of this model converged more quickly than the original model (21 iterations) and produced fit statistics that indicated acceptable fit according to the chosen criteria, χ2(35, N = 161) = 44.97,
p = .12; CFI = .98; RMSEA = .04 (90% CL = .00, .08), including a nonsignificant value of χ2. With one modification, the hypothesized model offers an acceptable account of the observed data. Although the model fits the data well, one might reasonably ask whether the five structural paths fixed to zero in the model can be justified statistically. To address this question, I used the multivariate Lagrange multiplier test described earlier in this chapter. Recall that this automated specification search strategy evaluates the full set of fixed parameters and determines whether freeing a subset would result in improved model fit. The test suggested two fixed parameters that, if freed, would significantly reduce the already nonsignificant χ2 by 12.69 (significant at p = .002 on two degrees of freedom). The parameter that would offer the largest improvement in fit if freed is the covariance between the first public self-consciousness item and individual differences in contingency on appearance. This modification could be accomplished by freeing the path from contingency to the uniqueness of the first item, but there is neither conceptual nor logical justification for such a modification. The second fixed parameter that, if freed, would result in an improvement in fit is the structural path from the contrast between experimental condition within low contingency to situated focus on appearance. Although the addition of this path could be justified on conceptual grounds, it was not predicted a priori and would offer trivial improvement to the fit of a model that already fits the data well. Because the current analysis focuses on a single model to which a single modification was made in order to produce the final model to be estimated, the presentation of fit statistics can be done effectively in the text of the report. 
When the analysis includes a number of models to be compared and/or a series of modifications, the degrees of freedom and values of the chosen fit statistics can be arrayed in a table as an effective means of both presenting the values and showing how many models were estimated and in what order to produce the model to be interpreted.
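Several of the values reported in this section can be checked by hand from the χ2, degrees of freedom, and sample size. The sketch below is illustrative only: it assumes the common formula for the RMSEA point estimate with N − 1 in the denominator (some programs use N), and it uses the closed-form upper-tail probability of χ2 on two degrees of freedom. The function names are my own.

```python
import math


def rmsea(chi2, df, n):
    """RMSEA point estimate, assuming N - 1 in the denominator."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def chi2_p_value_2df(chi2):
    """Upper-tail p-value of chi-square on two degrees of freedom: exp(-x / 2)."""
    return math.exp(-chi2 / 2.0)


original = rmsea(74.39, 36, 161)     # originally specified model
modified = rmsea(44.97, 35, 161)     # after freeing one covariance
lm_p = chi2_p_value_2df(12.69)       # multivariate Lagrange multiplier test
unexplained = 0.23 / 0.58            # residual relative to observed correlation

print(round(original, 2), round(modified, 2), round(lm_p, 3), round(unexplained, 2))
```

The printed values agree with the RMSEA point estimates of .08 and .04, the p-value of .002, and the 40% figure given in the text.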
Parameter Estimates
Model estimation yields an estimate and standard error for every free parameter in the model. The current model, with the freed covariance between uniquenesses, comprises 31 free parameters. The challenge in presenting these results is how to convey this information in its entirety and in a format that does not unduly burden readers. In the case of simpler models, for which path diagrams are less cluttered, parameter estimates can be presented in the place of the * in the diagram and their significance level indicated by superscripts. For more complex models, such as the one considered here, a better alternative is presentation in a table. ML parameter estimates for the current model are provided in Table 4.2. Each row corresponds to a parameter and is labeled with reference to the variable
Table 4.2 Maximum Likelihood Estimates and Tests of Free Parameters

Parameter                                  Parameter   Standard   p value   Standardized
                                           Estimate    Error                Estimate
Measurement Model
  Loadings
    Item 1 on Public SC¹                                                      .43
    Item 2 on Public SC                      1.52        .25      < .001      .65
    Item 3 on Public SC                      1.35        .31      < .001      .57
    Item 3 on Appearance¹                                                     .36
    Item 4 on Public SC                      1.74        .35      < .001      .76
    Item 5 on Public SC                       .95        .29      < .01       .35
    Item 5 on Appearance                     1.30        .42      < .01       .41
    Item 6 on Public SC                      1.89        .38      < .001      .80
    Item 7 on Public SC                      1.08        .25      < .001      .55
    Item 7 on Appearance                      .57        .23      < .05       .25
  Uniquenesses
    Item 1                                    .97        .11      < .001      .90
    Item 2                                    .69        .09      < .001      .76
    Item 3                                    .52        .08      < .001      .65
    Item 4                                    .47        .07      < .001      .65
    Item 5                                   1.00        .14      < .001      .79
    Item 6                                    .42        .07      < .001      .59
    Item 7                                    .46        .06      < .001      .74
    Item 1 with Item 2                        .37        .08      < .001      .44
Structural Model
  Paths
    Contingency → Appearance                  .54        .14      < .001      .68
    Contingency → Public SC                   .34        .10      < .001      .36
    Contingency → Focus                       .42        .18      < .05       .18
    Condition/High → Focus                    .47        .12      < .001      .28
    Focus → Appearance                        .06        .04        .15       .19
    Focus → Public SC                         .08        .04      < .05       .19
  Covariances
    Contingency with Condition/Low            .02        .03        .55       .05
    Contingency with Condition/High           .01        .03        .84       .02
    Condition/Low with Condition/High         .00        .04        .99       .00
  Variances
    Contingency                               .25        .03      < .001     1.00
    Condition/Low                             .48        .05      < .001     1.00
    Condition/High                            .53        .06      < .001     1.00
    Focus disturbance                        1.29        .14      < .001      .94
    Appearance disturbance                    .07        .05        .18       .68
    Public SC disturbance                     .18        .07      < .05       .90

Note: Labels are abbreviations of labels in Figure 4.2.
¹ Parameter fixed at 1.0 to identify variance of latent variable.
names in Figure 4.2. The parameters are divided according to whether they are part of the measurement model or the structural model. Within the measurement model, parameters are divided into loadings and uniquenesses. Within the structural
model, parameters are divided into paths, covariances, and variances. In the first column are unstandardized parameter estimates. These are estimates that reflect the metrics of the variables involved. For instance, the value of 1.52 for the loading of item 2 on the public self-consciousness latent variable indicates that, for each increase in the latent variable of one unit, there is an increase of 1.52 units on the item. Accompanying each unstandardized estimate is a standard error, given in the second column. The ratio of these two values yields an observed z statistic, for which p-values are provided in the third column. In the last column are standardized estimates, which social and personality researchers typically interpret. For the loadings, these are values in the form typical of factor analysis. For covariances, the standardized estimates are correlation coefficients. And for paths, the coefficients are betas as in multiple regression analysis. Depending on the audience, the table might have been labeled differently or its contents presented in a different form. I could have labeled the rows using double-label or LISREL notation. For example, the parameter labeled “Item 2 on public SC” would be v6,F1 in double-label notation or λ12 in LISREL notation. Instead of providing standard errors and p-values for the critical ratio, I could have provided a single column that included the critical ratios flagged for significance using superscripts. Although the description of the results might say little or nothing about the values of the uniquenesses or variances in the structural model, it will no doubt focus on the loadings in the measurement model and the path coefficients in the structural model. Note that all loadings are significant. The loadings on the appearance subfactor are relatively small, suggesting that the effect of this latent variable on its three indicators is relatively weak compared to the effect of the primary latent variable.
The path coefficient of greatest interest, from the condition contrast within high contingency to situated focus on appearance, is relatively strong. When coupled with the significant path from focus to state public self-consciousness, the pattern corresponds well to the predictions.
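The critical ratios that underlie the p-values in Table 4.2 can be recovered from the first two columns. The following minimal sketch, with a hypothetical function name, divides an estimate by its standard error and converts the resulting z statistic to a two-tailed p-value using the standard normal distribution.

```python
import math


def critical_ratio(estimate, standard_error):
    """Return the z statistic and its two-tailed p-value."""
    z = estimate / standard_error
    # Two-tailed normal probability: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Estimates and standard errors as printed in Table 4.2.
rows = {
    "Item 2 on Public SC": (1.52, 0.25),
    "Condition/High -> Focus": (0.47, 0.12),
    "Focus -> Public SC": (0.08, 0.04),
}
for label, (estimate, se) in rows.items():
    z, p = critical_ratio(estimate, se)
    print(f"{label}: z = {z:.2f}, p = {p:.4f}")
```

Because the tabled estimates and standard errors are rounded to two decimal places, p-values recomputed this way can differ slightly from those produced by the SEM program, particularly for small estimates.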
Interpretation
Successful completion of the analysis phase of an implementation of SEM is challenging and rewarding. Yet, as is evident from the section on presentation, successfully navigating the terrain between acquisition of “clean” output from an SEM computer program and a manuscript suitable for publication is deceptively challenging. An effective presentation of the results sets the stage for the final stage of implementation, interpretation. Although the interpretation will, in large measure (and rightly so), focus on the contribution of the results to knowledge on the structures or processes to which it applies, it must also touch on important features of the data and modeling conditions that produced the results. To that end, a goal in interpreting SEM results is the
highlighting of characteristics of SEM analyses, in general and in the specific implementation, that affect what can and cannot be said about the structures or processes of interest. In the remainder of this section, I highlight a number of characteristics that should be routinely considered when interpreting results of SEM analyses.
Description of Latent Variables
It is not uncommon for reports of SEM analysis to describe latent variables in unqualified terms as manifestations of constructs from which measurement error has been removed. Recall that, when, as is virtually always the case in social and personality research, indicators of a latent variable are reflective, the latent variable is a function of the commonality across the full set of indicators. As such, if all indicators of a latent variable are subject to the same source of measurement error, the latent variable, in fact, is not free of the influence of that source of error. For instance, if error attributable to self-reports is a concern but all indicators are operationally defined as self-reports, that error is reflected in the latent variable rather than the uniquenesses (i.e., measurement errors). Only the influence of those sources of error that vary across indicators, as in multitrait–multimethod models or collateral reports, is removed from latent variables (DeShon, 1998). For this reason, the correct interpretation of latent variables with reference to measurement error is as manifestations of constructs from which sources of error that vary across the indicators have been removed.
Causal Language
It is perhaps unfortunate that early forms of SEM were referred to as causal modeling (e.g., Bentler, 1980; Blalock, 1971). Given this label, many early adopters of SEM methodology were motivated by an erroneous belief that SEM offered a means of testing causal hypotheses using nonexperimental data. Although SEM provides certain advantages over other statistical procedures for modeling putatively causal effects, even the most powerful statistical procedures cannot overcome the limitations of certain research designs. The traditional view of causality specifies three basic requirements for inferring causality: (1) the cause and effect must be related; (2) alternative explanations for the observed relation must be ruled out by isolating the putative cause from alternative causes; and (3) the cause must precede the effect in time. SEM offers a certain advantage regarding the demonstration of a relation because of the capacity to model constructs as latent variables, thereby producing disattenuated estimates of association. SEM has less to offer in terms of the isolation criterion. When research participants self-select to levels of a putative cause, they bring with them many additional characteristics that covary with the cause. Thus, there is an inherent
ambiguity in the interpretation of the cause–effect relation. From a design perspective, the putative cause can be disentangled from extraneous variables through random assignment to levels of a clean manipulation of the causal variable. In the absence of random assignment and manipulation, the isolation concern must be addressed using statistical methods. This strategy is of limited value, however, because it is not possible to determine when one has isolated a putative cause from all possible confounding variables. SEM offers certain advantages over traditional statistical approaches in this regard; however, it cannot fully address the isolation problem. Finally, there is the concern of temporal precedence, or directionality. As with isolation, the definitive strategy is to manipulate the putative cause, thereby ensuring that the only plausible direction for the cause–effect relation is from the cause to the effect. An alternative strategy is to introduce time into the design as in panel designs. Although these designs strengthen the case for a directional effect, they are not conclusive. SEM is particularly well suited for use with longitudinal data; however, as far as the inference of directionality goes, it cannot overcome the limits of the research design. Despite the advantages SEM affords over statistical methods typically used in social and personality research, it offers only modest improvement over these methods for inferring causality according to traditional criteria. An emerging view of causal inference provides a stronger basis for inferring causality from SEM results. This view focuses on inference regarding the model as a whole (as opposed to specific paths), emphasizing the importance of paths fixed at zero, which are not a feature of traditional statistical methods such as ANOVA and multiple regression analysis. 
Causal, as opposed to statistical, analysis focuses on the probability of the model given certain conditions (e.g., its basis in theory, the inclusion of potential confounding variables) and alternatives (Pearl, 2009; for an introduction, see Pearl, 2010). This view of causality is profoundly different from the traditional view, with its focus on research design, sampling, and statistical analysis, and, if adopted, would substantially strengthen inferences of causality from SEM analyses.
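The disattenuation mentioned above in connection with the first criterion can be illustrated with the classical correction for attenuation, which divides an observed correlation by the square root of the product of the two reliabilities. This is a simplified stand-in for what modeling constructs as latent variables accomplishes, not the SEM estimator itself; the function name and the numeric values are hypothetical.

```python
import math


def disattenuate(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Classical correction for attenuation: r_xy / sqrt(rel_x * rel_y)."""
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)


# Hypothetical values: observed r = .30, reliabilities of .70 and .80.
corrected = disattenuate(0.30, 0.70, 0.80)
print(round(corrected, 3))
```

The corrected value is larger than the observed correlation whenever either reliability is below 1, which is the sense in which latent variable estimates of association are disattenuated. The correction strengthens evidence of a relation only; it does nothing for the isolation or temporal precedence criteria.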
Suitability of Sample Size
A criticism that sometimes is leveled at applications of SEM is that the data are from samples too small to meet the large-sample assumption on which most estimators and tests are based. Although it is not entirely clear what constitutes a large sample, it is abundantly clear that samples typical of research in social and personality psychology are relatively small. Indeed, samples may routinely be too small to support valid estimation and the resulting tests of model fit. The practical question for social and personality researchers is what minimum number of observations is necessary for valid estimation. Although there are qualifying factors, and the number is somewhat variable as a function of the particular outcome in
question (e.g., parameter estimates, fit indices), simulation studies point to about 400 as the number of observations at which the outcomes of ML estimation correspond to expectation (e.g., Bentler, 1990). The stability of parameter estimates is questionable in all but the simplest models (i.e., fewer than 10 variables) with fewer than 200 observations (Loehlin, 1992), though, as illustrated in this chapter, SEM with fewer observations is feasible with normally distributed data and a well-fitting model. This number increases as the distributions of the variables depart from normality and as models become more complex. The minimum number is substantially larger for estimators that do not assume normality and/or continuous measurement. For these reasons, results from SEM analyses based on the smaller samples typical of research in social and personality psychology must be interpreted with caution, including acknowledgment that the findings are only suggestive until replicated using data from suitably large samples.
Modification and Inference
In the first part of the chapter, I highlighted the potential for errors of inference when using specification searching (manual or automated) to modify a model. Because decisions about model modification are made by consulting results of initial analyses of the data, the likelihood of Type I error is unacceptably high, and therefore fit statistics and indices cannot be taken at face value. Thus, although investigators might be tempted to interpret the results from estimation of modified models that produce acceptable values of fit indices as if they were specified a priori, they should instead interpret them as suggesting hypotheses to be tested using another set of data from the population. Only then can inferences about the population from which the sample was drawn be made.
Equivalent Models
Whether the results that are being interpreted are for a model specified a priori or one based on post hoc modifications, care must be taken to acknowledge the presence of other models that, on statistical grounds, are equivalent. The issue of equivalent models would be troubling enough if, for a given model specification, several equivalent alternatives could be generated; however, research design and logical considerations aside, an infinite number of equivalent models could be generated, each producing the same implied covariance matrix and, therefore, the same fit statistics (Raykov & Marcoulides, 2001). Many such models, though defensible on statistical grounds, are implausible or uninteresting. A potentially large number, however, suggest plausible alternatives that are both statistically indistinguishable and conceptually incompatible with the favored model (MacCallum, Wegener, Uchino, & Fabrigar, 1993). This concern is illustrated in Figure 4.3. Imagine that the model depicted in the path diagram at the top of the
Figure 4.3 Three equivalent models
figure, one in which the effect of v1 on v3 is partially direct and partially indirect through v2, is the favored model. Two models that are statistically indistinguishable from the favored model are shown in the lower portion of the figure. In the model to the left, v2 is an independent variable rather than a mediator, and the effect of v1 on v3 is only direct. In the model to the right, the direction of the relation between v1 and v2 has been reversed, so that v2 is now the independent variable and v1 the mediator. The discussion of any implementation of SEM should explicitly state that the favored model, though consistent with the observed data and, presumably, well-justified predictions, is not unique in its ability to account for the data. Such discussion is strengthened when some of the more plausible or important alternatives are derived and presented as context for interpreting the favored model. A number of straightforward rules can be used to derive equivalent structural models (Lee & Hershberger, 1990; Stelzl, 1986). These generally involve replacing a nondirectional path with a directional path or reversing the direction of a path. Acknowledging the presence of these models, though it weakens the case for the favored model as the “correct” model, provides a better indication of how seriously the results should be taken and the potential direction of future research on relations in the model. It bears mention that the exercise of deriving equivalent models can be undertaken before data are collected (Hershberger, 1994). The advantage to generating equivalent models prior to data collection is that one or more alternatives might suggest alterations to the research design that serve to rule them out.
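The equivalence of the three models in Figure 4.3 can be checked numerically. The sketch below is illustrative only: the covariance matrix is hypothetical, the helper names are not from the text, and it assumes that v1 and v2 covary freely in the lower-left model. Each model is just identified, so its parameters can be solved directly from the covariances, and each reproduces the same implied covariance matrix.

```python
def matmul(a, b):
    """Multiply two square matrices stored as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def implied_cov(B, Psi):
    """Implied covariance matrix (I - B)^-1 Psi (I - B)^-T.

    B[i][j] is the directional path from variable j to variable i; Psi holds
    variances and covariances of independent variables and disturbances.
    For a recursive three-variable model B^3 = 0, so (I - B)^-1 = I + B + B^2.
    """
    n = len(B)
    B2 = matmul(B, B)
    A = [[float(i == j) + B[i][j] + B2[i][j] for j in range(n)] for i in range(n)]
    At = [list(col) for col in zip(*A)]
    return matmul(matmul(A, Psi), At)


def equivalent_models(S):
    """Implied covariance matrices for the three models, solved exactly from S."""
    s11, s12, s13 = S[0]
    s22, s23, s33 = S[1][1], S[1][2], S[2][2]
    # Paths from v1 and v2 to v3, shared by all three models.
    det = s11 * s22 - s12 ** 2
    b = (s22 * s13 - s12 * s23) / det
    c = (s11 * s23 - s12 * s13) / det
    var_d3 = s33 - b * s13 - c * s23
    # Top model: v1 -> v2, v1 -> v3, v2 -> v3.
    a = s12 / s11
    top = implied_cov([[0, 0, 0], [a, 0, 0], [b, c, 0]],
                      [[s11, 0, 0], [0, s22 - a * a * s11, 0], [0, 0, var_d3]])
    # Lower left: v1 and v2 are correlated independent variables.
    left = implied_cov([[0, 0, 0], [0, 0, 0], [b, c, 0]],
                       [[s11, s12, 0], [s12, s22, 0], [0, 0, var_d3]])
    # Lower right: v2 -> v1, v2 -> v3, v1 -> v3.
    r = s12 / s22
    right = implied_cov([[0, r, 0], [0, 0, 0], [b, c, 0]],
                        [[s11 - r * r * s22, 0, 0], [0, s22, 0], [0, 0, var_d3]])
    return {"top": top, "lower_left": left, "lower_right": right}


def max_abs_diff(a, b):
    """Largest absolute elementwise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Hypothetical covariance (here, correlation) matrix for v1, v2, v3.
S = [[1.0, 0.5, 0.4],
     [0.5, 1.0, 0.6],
     [0.4, 0.6, 1.0]]

for name, sigma in equivalent_models(S).items():
    print(name, max_abs_diff(sigma, S) < 1e-9)
```

Because all three implied matrices match the observed matrix, no fit statistic computed from the discrepancy between them can favor one model over another; the choice has to rest on design and theory.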
Putting the Implementation Framework to Work
I have now presented in general terms the different phases of the implementation framework for using SEM in social and personality research. To recap, the process begins with specification, with special attention to the identification status of any models to be estimated. Specification is followed by estimation, the process by which values of the free parameters are generated from the data given the model specification. Estimation yields values that can be used to build fit statistics and indices, which are used to evaluate the degree to which a model provides an acceptable account of the data. If, by pre-established criteria, the model fits the data, implementation moves to presentation and interpretation as outlined in this chapter. If the originally specified model does not fit the data, then it may be modified based on a specification search. Although the results from estimation of a modified model are presented in a manner similar to that for results for an a priori model, interpretation for the modified model must be appropriately qualified given concerns about capitalization on chance. Equipped with the “how to” of SEM, you now are in a position to consider the various ways in which SEM might be used to model processes of interest to social and personality psychologists. The next chapter reviews the array of modeling possibilities with a particular focus on models for which the statistical methods traditionally used by researchers in social and personality psychology are ill-suited.
5
Modeling Possibilities and Practicalities
Equipped with an implementation framework for using SEM, you are now in a position to consider the various types of models one might specify and the research questions they could be used to address. These range from relatively straightforward models, such as confirmatory factor analysis and path analysis, to complex models that include multiple waves and levels of data. My goal in this chapter is not to provide a detailed explication of how each model is specified. Rather, it is to describe the models in such a way that it will be evident to readers which models hold promise for their own research and what type of data would be necessary to estimate those models. For each model, I cite sources that offer more detailed information about the model. For many of the models, I also touch on practical considerations involved in specification, estimation, and interpretation. A general modeling possibility that applies to any of the models presented in the chapter warrants mention at the outset. Using SEM methods, one can simultaneously fit a model to data from two or more samples (Sörbom, 1974), a strategy referred to as multigroup modeling. In so doing, one can compare values of subsets of parameters, or all parameters, across groups. Such comparisons are, in effect, tests of statistical interaction because they address the question of whether values of the parameters are consistent across levels of a variable, typically referred to as a moderator variable. The comparisons are done through tests of the tenability of between-group equality constraints on free parameters, a strategy described in the section on measurement invariance later in the chapter. A significant advantage of comparing effects between groups using multigroup modeling is that the effects of interest can be estimated between latent variables. 
By using latent variables to correct estimates of the effects for attenuation due to unreliability, extraneous between-group differences in an effect due to differential reliability of measurement are ruled out (illustrated later in the chapter with cross-lagged panel models). This general strategy can be applied to any model for which sufficient data have been collected for the groups to be compared. Thus, although multigroup modeling is not addressed with reference to each model described in the remainder of the chapter, it is a modeling possibility for any of them.
Measurement Models
Although measurement models are often a subcomponent of models in which the primary interest is structural paths, they may be of interest in their own right; that is, research questions may concern the definition or nature of latent variables rather than the causal relations between them. Implementations of SEM that include only the measurement model are collectively referred to as confirmatory factor analysis (Brown, 2006). In this section of the chapter, I review and illustrate a number of measurement models of potential value for addressing research questions typical of social and personality psychology.
First-Order Factors
The first-order factor is the building block of measurement models. As used here, first order means that the relations between indicators and latent variable are direct. As noted in Chapter 2, the preponderance of latent variables in social and personality psychology is assumed to explain some portion of the variance in their indicators. For this reason, in most instances directional paths run from the latent variable to the indicators. For latent variables of this type, the indicators are termed reflective, or effect, indicators because they are a reflection of the latent variable and, in relation to the latent variable, an effect. The indicators are assumed to be, in effect, fallible replicates. As such, all, or most, of their correlations with each other are assumed to be attributable to the latent variable. Evaluations of the quality of indicators in such models typically follow from classical test theory assumptions (e.g., a portion of the variance in items is measurement error, internal consistency should be high). A simple first-order latent variable with reflective indicators is shown on the left in Figure 5.1. On the right in Figure 5.1 is a latent variable with formative, or causal, indicators. Note that the direction of the paths has been reversed so that they now run from the indicators to the latent variable. Note also that there are no uniquenesses associated with the indicators and the variance of F1 has been replaced by a disturbance term. Each indicator is assumed to represent a somewhat unique component of the latent variable. Two ideas follow from that assumption. First, there is no assumption about the correlations between the indicators. Indeed, the lower the correlation between them, the more unique information each one brings to the latent variable. For this reason, classical test theory assumptions do not apply (Bollen & Lennox, 1991). Second, the set of indicators is assumed to fully define the latent variable. 
In this way, the rationale for selecting indicators differs substantially for formative and reflective indicators. Because reflective indicators are replicates, removing one, in theory, does not change the meaning of the latent variable. Formative indicators, on the other hand, are separate components of the latent variable; therefore, removing one changes the meaning of the latent variable.
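The contrast between the two kinds of indicators can be written as a pair of equations. This is a sketch in generic notation rather than the notation of the path diagrams: the symbols \lambda_j and \gamma_j are introduced here for illustration, and in practice one coefficient in each set would be fixed to set the scale of F1.

```latex
\begin{align}
\text{Reflective:}\quad & v_j = \lambda_j F_1 + e_j, \qquad j = 1, 2, 3 \\
\text{Formative:}\quad & F_1 = \gamma_1 v_1 + \gamma_2 v_2 + \gamma_3 v_3 + d_1
\end{align}
```

In the reflective case there is one equation per indicator, each with its own uniqueness e_j. In the formative case there is a single equation with the latent variable on the left, no uniquenesses, and one disturbance d_1, which is why dropping an indicator changes what F1 means.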
Figure 5.1 First-order factors with reflective (left) and formative (right) indicators
Examples of formative indicators are rare in social and personality psychology due in large measure to the fact that indicators typically are behaviors assumed to reflect the construct of interest (e.g., response latency as an indicator of accessibility, noise blast as an indicator of aggression) or questionnaire items written with the assumption that responses reflect the construct. An example of formative indicators with which many readers would be familiar is measures of life stress that assess the occurrence of a set of life events assumed to cause stress. The occurrence of each event is assumed to result in an increase in life stress and failure to assess any life event that produces stress would result in an incomplete measure. Moreover, there is no reason to believe that the occurrence of different life events would be correlated, a requirement if they are to be modeled as fallible reflections of an underlying construct. Although one might question whether variables that are, in effect, weighted composites of a set of observed variables are latent variables at all (e.g., MacCallum & Browne, 1993), they meet certain minimal definitions of a latent variable (e.g., Bollen, 2002). When modeling of latent variables with formative indicators is warranted, a significant concern is identification. It can readily be shown that the model to the right in Figure 5.1 is not identified. This concern has been addressed by the development of rules for identifying parameters in models with formative indicators. As an example, a latent variable with formative indicators must have a direct influence on at least two variables in the model (Bollen & Davis, 2009). Apart from identification, this specification has the appeal of producing a set of estimates for the formative indicators that yield a composite that is optimized for predicting its outcomes. 
Returning now to the general discussion of first-order factors and restricting the discussion to models with reflective indicators, we can consider modeling
Figure 5.2 Three-factor model with simple structure and correlated factors
possibilities when a set of indicators is assumed to imply more than one factor. A path diagram depicting a three-factor model of nine indicators is shown in Figure 5.2. A noteworthy feature of this specification is the fact that each indicator loads on only one of the three factors; that is, its loadings on the other two factors have been fixed to zero. This specification corresponds to simple structure, the target of many rotation strategies in exploratory factor analysis (Thurstone, 1954). Note also that the covariances between the three factors are free, corresponding to an oblique rotation. Finally, note that the covariances between uniquenesses are fixed at zero, corresponding to an assumption in exploratory factor analysis. For these reasons, the model reflects a number of hypotheses about the factor structure underlying the pattern of covariances between the indicators. A number of alternative models might be specified as a means of testing hypotheses reflected in the model. Focusing first on the relations between the latent variables, the set of three covariances between them might be fixed to zero as a means of determining whether an orthogonal model accounts better for the data than the oblique model specified here. At the other extreme, the paths might be fixed at values reflecting perfect correlation as a means of testing whether three factors are required or whether one would suffice. It should be noted that, as specified, the paths between latent variables are covariances. As such, the specific values that reflect perfect correlation are difficult to determine. In an alternative, and equivalent, specification, I can free the three loadings currently fixed at 1.0, and fix the variances of the latent variables to 1.0. This, in effect, standardizes the latent variables, which renders the paths between them correlations. Within this specification, I can fix the paths to 1.0 as a means of modeling perfect correlation,
or a one-factor model. Note that both of these alternatives, the orthogonal model and the one-factor model, are nested in the model shown in the figure. As such, they can be statistically compared to that model using a χ² difference test on three degrees of freedom. Although I might have no a priori reason to consider alternative specifications of the loadings and uniquenesses, poor fit of the specified model might point to the need to reconsider the subset of these parameters fixed to zero. For instance, the assumption of simple structure might not be tenable, in which case cross-loadings need to be added to the model. Alternatively, there might be commonality in pairs of uniquenesses that is not explained by the latent variables, in which case the corresponding covariances between uniquenesses need to be freed. Such modifications must take into account identification and, when made on the basis of specification searching, must be presented with appropriate qualification. It is sometimes the case that the fit of a model is sufficiently poor that relatively minor modifications such as adding double loadings or freeing covariances between uniquenesses are not sufficient to produce acceptable fit to the data. For instance, looking to the pattern of loadings as a potential source of misspecification assumes, more fundamentally, that the number of factors is correct. Referring back to Figure 5.2, the specified model assumes three latent variables, and any specification searching is with that number held constant. In early evaluations of item sets or for established item sets for which the number of factors is uncertain, an exploratory form of confirmatory factor analysis, the unrestricted factor model, unconfounds the number of factors and the pattern of loadings as potential sources of misspecification (Hoyle & Duvall, 2004). The unrestricted factor model derives its name from the fact that it specifies all indicators loading on all factors with only those restrictions necessary to identify the model. 
Because virtually all loadings are free, any misspecification (correlated uniquenesses aside) can be attributed to the incorrect number of factors. Models with differing numbers of factors can be evaluated with reference to the standard criteria for model fit or in comparison to each other. Once the correct number of factors has been determined, either a manual search or a multivariate automated search such as the Wald test can be used to identify loadings that can be fixed to zero, thereby introducing the restrictions typical of measurement models in SEM. As with any exploratory analysis, cross-validation using data from an independent sample from the same population is necessary before firm inferences about the model can be drawn. A general issue to be considered when estimating any measurement model is the scale on which indicators were measured. I noted in Chapter 3 that ML estimation assumes continuous measurement. Although there are instances of continuous measurement in social and personality research (e.g., continuous physiological monitoring, magnitude estimation), most measurement is by a set of response options that are, at best, interval and, more likely, ordinal scaled. The ML estimator
is reasonably robust to violations of this assumption at the level typical of social and personality research (i.e., interval-appearing response scales with five or more alternatives). When the level of measurement is too coarse for ML estimation, models should be estimated by robust weighted least squares (Muthén, 1983), which, for most SEM computer programs, requires converting covariances to polychoric correlations. The conversion to polychoric correlations assumes that underlying the ordinal variables are continuous and normally distributed variables (Flora & Curran, 2004).
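The nested comparisons described earlier in this section, in which the orthogonal and one-factor alternatives are compared with the model in Figure 5.2, rest on a count of degrees of freedom and a χ² difference test. The sketch below works through that arithmetic using only the Python standard library. The parameter counts follow the description of Figure 5.2 in the text; the function names and the χ² values passed to the test are hypothetical.

```python
import math


def unique_moments(p: int) -> int:
    """Number of unique variances and covariances among p observed variables."""
    return p * (p + 1) // 2


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variate with integer df."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-h) * sum_{i=0}^{df/2 - 1} h^i / i!
        return math.exp(-h) * sum(h ** i / math.factorial(i) for i in range(df // 2))
    # Odd df: erfc(sqrt(h)) + exp(-h) * sum_{i=1}^{(df-1)/2} h^(i - 1/2) / Gamma(i + 1/2)
    tail = sum(h ** (i - 0.5) / math.gamma(i + 0.5) for i in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(h)) + math.exp(-h) * tail


def chi2_difference(chisq_restricted, df_restricted, chisq_full, df_full):
    """Chi-square difference test for nested models; returns (diff, df, p)."""
    diff = chisq_restricted - chisq_full
    df_diff = df_restricted - df_full
    return diff, df_diff, chi2_sf(diff, df_diff)


# Figure 5.2 as described: 6 free loadings (one per factor is fixed to 1),
# 9 uniqueness variances, 3 factor variances, 3 factor covariances.
free_oblique = 6 + 9 + 3 + 3
df_oblique = unique_moments(9) - free_oblique             # 45 - 21 = 24
df_orthogonal = unique_moments(9) - (free_oblique - 3)    # covariances fixed to zero

# Hypothetical chi-square values for the orthogonal and oblique models.
diff, df_diff, p = chi2_difference(61.2, df_orthogonal, 48.9, df_oblique)
print(df_oblique, df_orthogonal, df_diff, round(p, 4))
```

A small p value would indicate that fixing the three covariances to zero significantly worsens fit, favoring the oblique model. The same function applies to the one-factor comparison, which also differs from the three-factor model by three degrees of freedom.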
Higher-Order Factors
In a multifactor model with at least three factors, a second-order factor (i.e., one twice removed from the indicators) can be specified and estimated. (At least three second-order factors permit a third-order factor, and so on.) A respecification of the three-factor model discussed earlier to include a second-order factor is shown in Figure 5.3. First, note that the covariances between the factors are gone. This is because, just as the first-order factors explain the commonality between their indicators, the second-order factor explains the commonality between its indicators, which are the first-order factors. Associated with the first-order factors are disturbances, which are the equivalent of uniquenesses. The variance of the second-order factor is estimated; it is identified by fixing the loading of F1 on F4. As discussed in Chapter 2, three indicators are the minimum necessary to produce an identified model with reflective indicators. To further develop this point, imagine a covariance matrix based on the latent variables F1, F2, and F3, which would comprise six unique elements. The free parameters in the second-order portion of the model are two loadings, three uniquenesses, and one variance – a total of six. As such, this portion of the model is just identified. It also is equivalent to the model shown in Figure 5.2, which distributes the degrees of freedom at the factor level as three variances and three covariances. For this reason, the two models cannot be distinguished on the basis of model fit. Their estimation would yield identical implied covariance matrices and, therefore, identical fit statistics. The comparison of first- and second-order factor models requires at least four first-order latent variables. Second- and higher-order factor models allow for compelling tests of models in which constructs manifest at various levels of abstraction (e.g., specific v. general intelligence, domain-specific v. global self-esteem). 
Following the logic of specific effects described in Chapter 4, they also allow for estimation of the effects of a construct at both the specific and general levels of abstraction. Returning to the figure, d1 comprises variance in F1 not attributable to F4; that is, variance it does not share with the other first-order factors. As such, it affords the opportunity of examining the effect of the construct at its most general and abstract (F4), as well as in its more specific forms (e.g., d1).
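The counting argument for the second-order portion of the model generalizes to any number of first-order factors. The short sketch below, with a hypothetical function name, assumes a single second-order factor whose scale is set by fixing one loading, as in Figure 5.3.

```python
def second_order_df(n_first_order: int) -> int:
    """Degrees of freedom for the second-order portion of a model with one
    second-order factor and one of its loadings fixed for identification."""
    # Unique variances and covariances among the first-order factors.
    moments = n_first_order * (n_first_order + 1) // 2
    loadings = n_first_order - 1       # one loading is fixed
    disturbances = n_first_order       # one disturbance variance per first-order factor
    factor_variance = 1                # variance of the second-order factor
    return moments - (loadings + disturbances + factor_variance)


for k in (3, 4, 5):
    print(k, second_order_df(k))
```

With three first-order factors the result is zero, matching the six elements and six free parameters counted in the text, so the second-order structure cannot be tested. With four first-order factors two degrees of freedom remain, which is why the comparison of first- and second-order models requires at least four first-order latent variables.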
Figure 5.3 Second-order factor model
Models with Subfactors
The example used to illustrate the presentation of SEM results in Chapter 4 included seven indicators presumed to reflect two latent variables (see Figure 4.2). Unlike the multifactor models reviewed so far in this chapter, however, the model did not evidence simple structure. Recall that the indicators were items comprising the public subscale from the Self-Consciousness Scale. All seven were specified as reflective indicators of public self-consciousness, and a subset of three was specified as also reflecting a second factor, appearance-related self-consciousness. Put differently, the model assumes that three of the items share a source of commonality that is not shared with the remaining items and, therefore, not reflected in the public self-consciousness factor. In that model, appearance-related self-consciousness is a subfactor (Rindskopf & Rose, 1988). I might have instead accounted for this secondary source of commonality by freeing the covariances between the uniquenesses associated with these three items, an observation that shows the relation between subfactors and correlated uniquenesses. Although the presence of subfactors in measurement models is not easily detected by specification searching, patterns of suggested modifications may point to a single source of unexplained covariance between three or more indicators. Subfactors might also be posited on the basis of knowledge about the indicators, as
was the case with the appearance-related self-consciousness subfactor in our model of public self-consciousness. Although that subfactor was substantive in nature, subfactors often reflect method factors that affect variance in some indicators and not others. For example, reverse-scored items on a scale often share commonality they do not share with items not reverse scored. Any such method factor can be modeled as a subfactor if it does not influence all of the indicators. As illustrated in the public self-consciousness example, estimation can be challenging. (Recall that 38 iterations were required in order to obtain convergence.) In the remainder of this section, I highlight two models with subfactors that are specified a priori. One, the multitrait–multimethod model, is familiar to most researchers in social and personality psychology. The other, the trait–state–error model, though particularly promising for personality research, is not well known and therefore is rarely used. Each models variability in each of a set of indicators as emanating from two latent sources by positing factors to explain what otherwise would have been allocated to measurement error (i.e., uniqueness). The prototypical multitrait–multimethod (MTMM) model is shown in path diagram form in Figure 5.4. In the model, nine indicators are influenced by three latent traits, T1, T2, and T3, and three latent method factors, M1, M2, and M3.
Figure 5.4 Multitrait–multimethod model (Uniquenesses and parameters are omitted to improve readability)
Each indicator also is influenced by a uniqueness term (not shown). Thus, for example, the measurement equation for the first indicator would be written as t1m1 = *T1 + *M1 + e11.
A number of parameter estimates in this model might be of interest to the researcher. For instance, tests of the variances of the latent method factors would indicate whether those factors are evident in the indicators. If there is evidence favoring the posited method factors, then equality constraints might be used to compare the relative contribution of trait and method to variance in each indicator. Because the loadings of the indicators on the traits are free from the influence of the modeled methods, covariances between the traits cannot be attributed to those methods. Moreover, covariances or directional paths between the traits and other substantive latent variables in the model could not be attributed to method variance in the traits. Finally, other variables could be specified as predictors of the method factors for the purpose of detecting biases attributable to those methods. Although this specification of the MTMM model maps well onto the original characterization of MTMM matrices (Campbell & Fiske, 1959), it does not fare well in estimation attempts. Under conditions that are relatively common, the model is empirically underidentified. Even when those conditions are not present, estimation often results in Heywood cases and failures to converge (Kenny & Kashy, 1992). A parameterization that loses some of the appeal of the prototypic specification but generally fares well in estimation attempts is the correlated trait, correlated uniquenesses model (Marsh, Byrne, & Craven, 1992). Consistent with my earlier reference to subfactors as related to correlated uniquenesses, method factors in this model are specified as sets of correlated uniquenesses. For example, returning to Figure 5.4, M1 would be replaced by covariances between the uniquenesses for t1m1, t2m1, and t3m1. M2 and M3 would be modeled similarly. 
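The two parameterizations can be laid out side by side by enumerating their terms. The sketch below is illustrative: the function names are hypothetical, and the uniqueness labels other than e11 (which appears in the text) assume the same trait-then-method indexing.

```python
from itertools import combinations


def mtmm_equations(n_traits: int = 3, n_methods: int = 3):
    """Measurement equations for the prototypical model; * marks a free loading."""
    return [f"t{t}m{m} = *T{t} + *M{m} + e{t}{m}"
            for t in range(1, n_traits + 1)
            for m in range(1, n_methods + 1)]


def correlated_uniquenesses(n_traits: int = 3, n_methods: int = 3):
    """Covariances between uniquenesses that replace each method factor:
    every pair of indicators measured by the same method."""
    pairs = {}
    for m in range(1, n_methods + 1):
        same_method = [f"e{t}{m}" for t in range(1, n_traits + 1)]
        pairs[f"M{m}"] = list(combinations(same_method, 2))
    return pairs


for equation in mtmm_equations():
    print(equation)

for method, pairs in correlated_uniquenesses().items():
    print(method, "replaced by covariances:", pairs)
```

With three traits and three methods, each method factor is replaced by three covariances between uniquenesses, nine in all. Those covariances absorb shared method variance but, unlike latent method factors, cannot themselves be related to other variables in the model.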
When modeled in this way, the test for the presence of significant methods influence would involve comparing nested models in which the covariances between uniquenesses are fixed to zero (no method influence) or free. The advantages for estimation are offset somewhat by the absence of latent method factors, which in the original parameterization could be related to other variables. Another measurement model with subfactors is the trait–state–error model (Kenny & Zautra, 1995). The univariate trait–state–error model, shown as a path diagram in Figure 5.5, decomposes variance in a construct measured on four or more occasions into three components. The trait component is that part that does not change over time – the autoregressive component in panel and time-series designs. The state component is that portion of the variance that is reliable but variable over time. The error component is that portion of variance that is not reliable over time. This decomposition can be used to address questions concerning the degree to which a characteristic, as measured, manifests as a trait. Relatedly, it
Figure 5.5 Trait–state–error model (Variances on uniquenesses and disturbances are omitted to improve readability)
can partition the state manifestation of the characteristic from its stable form and error as a means of determining whether the variable is subject to change over time or context. In the univariate case, the trait–state–error model focuses on a single construct with the goal of characterizing the construct in terms of its temporal or cross-situational consistency. In the bivariate form of the model, the goal of decomposing variance in the construct is the study of the antecedents or consequences of the state component of the construct in relation to the state component of a second, similarly decomposed, construct. Because each construct is measured on multiple occasions, the focus is the lagged effects of each construct on the other as in cross-lagged panel models (described below). As with the MTMM model, the trait–state–error model can be challenging to estimate. Following the logic of the correlated trait, correlated uniqueness form of the MTMM model, adjustments to the specification improve the likelihood of convergence to a proper solution (e.g., Cole, Martin, & Steiger, 2005).
Measurement Invariance
Among the most overlooked measurement hypotheses in social and personality psychology are those concerning the invariance of a construct across groups or time. The question of measurement invariance concerns the degree to which a construct as represented by a set of indicators has the same meaning for different groups or for a single group at different points in time. The issue of measurement invariance is a profound one because the comparison of scores
Figure 5.6 Path diagram showing hypothetical factor structure and parameters to be tested for equivalence across groups or time
between groups or over time when those scores represent qualitatively different constructs is not valid. The various degrees of measurement invariance can be described with reference to a hypothetical measurement model such as the two-factor model depicted in path diagram form in Figure 5.6. In the path diagram, v1-v6 are indicators, measured separately for groups a and b. According to the model, commonality between the indicators can be attributed to two latent variables, F1 and F2. The typical measurement model parameters are indicated by the *. Notice that I have fixed the variance of the latent variables rather than a loading on each latent variable to identify the model. I have done that here because I want to compare all of the loadings between groups, and this comparison could not be done for any fixed loadings. The path diagram includes a shape and several paths I have not included in path diagrams to this point. The triangle in the center of each model indicates a constant, c, which is directed toward each of the indicators and the two latent variables. Assuming I have provided means or am using case-level data as input, inclusion of this component introduces intercepts into the measurement equations, allowing for the calculation of means for the latent variables. To illustrate, the measurement equation for the first indicator for Group A is v1a = c + *F1a + e1a.
To distinguish intercepts from other parameters in the model, I have designated them using pluses, +. (Although all of the intercepts are designated as free in the diagram, I would need to fix one for each latent variable in order to estimate
the latent means without producing an underidentified model.) Coefficients associated with the two short paths from the triangles to the latent variables are their estimated, or structured, means. Importantly, these or any other free parameters in the model can be tested for equivalence between the groups (i.e., invariance). This test is accomplished by constraining sets of parameters to equality across groups. To the extent that a model in which the constraints are present does not offer a statistically poorer account of the data than a model in which the constraints are absent, the affected parameters are assumed to be invariant across levels of the characteristic that distinguishes the two groups (e.g., gender, ethnicity, age). Measurement invariance typically is evaluated along a continuum that ranges from no equivalent parameters across groups or time to equivalence of all free parameters in the model. Complete invariance is neither likely nor required in order to conduct meaningful mean comparisons between groups or over time on the constructs. Rather, there must be enough invariance to support the assumption that the indicators reflect the same construct for the groups or time periods being compared. Evaluations of measurement invariance typically follow an ordered progression, testing and drawing conclusions about one set of parameters before moving to the next (Bollen & Hoyle, 1990; Widaman & Reise, 1997). The end result is a determination of the degree of measurement invariance across the groups or periods of time being compared. The most basic level of invariance, configural invariance, is invariance of form (Jöreskog, 1971). It concerns the number of sources of commonality evident in a set of indicators and the pattern of association between the indicators and latent variables (i.e., zero and nonzero loadings).
In Figure 5.6, this hypothesis is reflected in the specification of two F with indicators v1-v3 loading on F1 (and not F2), and indicators v4-v6 loading on F2 (and not F1) for both a and b. Absent configural invariance, evaluations of other aspects of invariance are not warranted. Weak metric invariance assumes that factor loadings are equal across groups or time. That is, each of the six loadings is equivalent for the two groups or periods of time. Strong metric invariance adds the restriction of equality of intercepts, the + in Figure 5.6. Evidence of strong metric invariance ensures valid comparisons of both means and variances on the latent variables (Widaman & Reise, 1997). Strict metric invariance assumes identical reliabilities across groups, tested by constraining item uniquenesses to be equal across groups or time. Invariance of the full set of loadings and intercepts for indicators of a latent variable is unlikely, particularly when the number of indicators is large. In such cases, the set of indicators must evidence at least partial invariance in order for valid comparisons across groups or time on the latent variables (Byrne, Shavelson, & Muthén, 1989). Minimally, the scale of the latent variable must be defined equally across groups or time points. Such partial invariance is established when the loading and intercept of one indicator are fixed to identify the latent variable, and equality constraints are tenable for the loading and intercept of at least one
other indicator. In such cases of minimal metric invariance, comparisons should be between structured means in the context of a measurement model (e.g., M1a and M1b, in Figure 5.6) as opposed to raw score means (Salzberger, Sinkovics, & Schlegelmilch, 1999). When describing the comparisons that might be made in studies of measurement invariance, I have referred both to comparisons between groups and comparisons over time. Although the question of invariance between groups typically motivates analyses such as those described here, the question of invariance within a group across the period of time covered by a longitudinal study could be addressed as well. This form of invariance may be particularly important in longitudinal studies that span a period of time during which developmental change in the construct of interest might be expected. From a practical perspective, the fundamental difference between studies of these two types of measurement invariance using SEM is the way in which the data are configured and the equality constraints imposed. When the concern is invariance between groups, the model is estimated simultaneously from two or more data sets and equality constraints imposed with explicit reference to the grouping variable. When the concern is invariance over time, the model is estimated from a single data set and includes repeated instances of the measurement model. Equality constraints are imposed between corresponding parameters in the model at different points in time. In either case, if structured means are to be estimated and compared, either the means need to be provided with the covariance matrix (a pairing sometimes referred to as the augmented moment matrix) or the model estimated from case-level data.
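The nested-model comparison that underlies each step of an invariance evaluation can be sketched in a few lines of Python. This is an illustration only: the fit statistics are hypothetical and do not come from any data set discussed here, the function names are invented for the example, and the chi-square tail probability is computed from the closed-form series for integer degrees of freedom so that no statistical library is assumed.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability for a chi-square variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over k < df/2 of (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: normal tail term plus a finite series in chi = sqrt(x)
    chi = math.sqrt(x)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    tail = math.erfc(chi / math.sqrt(2.0))
    return tail + math.sqrt(2.0 / math.pi) * math.exp(-half) * total

def chi2_difference_test(chi2_constrained, df_constrained, chi2_free, df_free, alpha=0.05):
    """Compare a model with equality constraints to the same model without them."""
    delta_chi2 = chi2_constrained - chi2_free
    delta_df = df_constrained - df_free
    p = chi2_sf(delta_chi2, delta_df)
    # The final element is True when the constraints are tenable at the chosen alpha.
    return delta_chi2, delta_df, p, p >= alpha

# Hypothetical fit statistics for a two-group model like the one in Figure 5.6:
# loadings free in both groups versus six loadings constrained equal (weak invariance).
d_chi2, d_df, p_value, tenable = chi2_difference_test(58.3, 22, 49.1, 16)
print(f"delta chi2 = {d_chi2:.1f}, delta df = {d_df}, p = {p_value:.3f}, tenable = {tenable}")
```

A nonsignificant difference, as in this made-up example, would be read as support for invariance of the constrained parameters, after which the next set of constraints (e.g., intercepts) would be added and tested in the same way.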
Factor Mixture Models
In the typical evaluation of measurement invariance across groups, the group membership variable is observed. Imagine, however, that a researcher has reason to believe that one or more parameters in a measurement model are not invariant throughout a population but has no a priori basis for assigning members of the population to groups for which the values of those parameters might vary. In order to address the question of which parameters differ and for whom, the researcher must first sort the population into homogeneous subpopulations based on an unobserved grouping variable, then compare parameters between samples representing the subpopulations. These two activities are accomplished simultaneously in factor mixture modeling. Factor mixture modeling is perhaps best understood in terms of its components (Lubke & Muthén, 2005). At the heart of the model is a measurement model based on reflective indicators. Because mean differences between groups often are of interest, the measurement model may include intercepts and means as illustrated earlier. As typically modeled, there is an implicit assumption that the estimated
model applies to all members of the population; that is, there is only one group. In factor mixture modeling, this assumption is tested by evaluating the degree of heterogeneity in the population evident in the observed data using latent class (for binary variables) or latent profile (for continuous variables) analysis. This component of the model generates a categorical latent variable whose number of levels equals the number of subpopulations, or classes, evident in the data. Individual cases are assigned probabilities that indicate their likelihood of belonging to each class given their data. The latent class variable is then used to evaluate variability on parameters in the measurement model. Additionally, variables, observed or latent, outside the measurement model can be specified as predictors of the latent class variable in order to explore empirically the substantive meaning of the distinction it reflects. A challenge in the use of factor mixture modeling is the determination of the optimal number of classes, or levels, of the latent class variable to retain (Nylund, Asparouhov, & Muthén, 2007). Not unlike the activity familiar to many social and personality researchers of deciding how many factors to extract in exploratory factor analysis, the recommended strategy is to consult statistical criteria but with reference to substantive criteria associated with the variables in the model and general knowledge about the population. The goal of factor mixture modeling often is to predict mean differences on the continuous latent variables as a function of the categorical latent class variable. As with any mean comparison, however, that test assumes a certain level of invariance of the measurement model that gives rise to the means. In factor mixture modeling measurement invariance is tested by comparing sets of parameters as a function of membership in latent class using nested model comparisons as described earlier. 
If an appropriate level of invariance is evident, valid comparisons of latent means can be made between levels of the latent class variable.
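The assignment of class probabilities to individual cases follows Bayes' rule. As a rough sketch only, assume a single continuous score and two latent classes whose proportions, means, and standard deviations are already known; all of these values are hypothetical, and the function names are invented for the example. In an actual factor mixture model these quantities are estimated jointly with the measurement model rather than supplied in advance.

```python
import math

def normal_pdf(y, mean, sd):
    """Density of a normal distribution at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def posterior_class_probabilities(y, proportions, means, sds):
    """Probability that a case with score y belongs to each latent class."""
    weighted = [p * normal_pdf(y, m, s) for p, m, s in zip(proportions, means, sds)]
    total = sum(weighted)
    return [w / total for w in weighted]

# Hypothetical two-class solution: equal class sizes, class means of -1 and 1.
proportions = [0.5, 0.5]
means = [-1.0, 1.0]
sds = [1.0, 1.0]

for score in (-1.0, 0.0, 1.0):
    probs = posterior_class_probabilities(score, proportions, means, sds)
    print(score, [round(p, 3) for p in probs])
```

A case whose score falls midway between the class means is assigned equal probabilities, whereas a case at either class mean is assigned mostly, but not entirely, to that class.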
Latent Growth Models
As shown in the description of evaluations of measurement invariance and factor mixture modeling, the inclusion of observed means in the data opens up additional modeling possibilities. In those models, the focus typically is structured means; that is, means of the latent variables given the parameters in the measurement model. Estimation of these means is made possible by the addition of intercept terms to the measurement equations. The inclusion of the observed means also allows for the specification of models focused on modeling the pattern of those means. Of particular interest is the situation in which means are available for the same variables on repeated occasions across a period of time during which values on the variables might be expected to change incrementally. Latent growth modeling is an implementation of SEM aimed at modeling and explaining variability in trajectories of change (Willett & Sayer, 1994).
Figure 5.7 Latent growth modeling estimating variability in intercepts and slope of linear trajectories (Only fixed parameters are shown)
A simple latent growth model is shown in Figure 5.7. An observed variable, v, has been assessed on four occasions. Each observation is influenced by two latent variables, Intercept and Linear. All loadings are fixed and means are estimated for the latent variables. This is a multilevel model (Bryk & Raudenbush, 1992). In effect, the intercept and slope parameters in a simple regression equation are estimated for each case. Variance in these individual growth parameters is modeled at Level 2 by the latent variables (Willett & Sayer, 1994). The means of the latent variables are the average intercept and linear slope for the sample. As with any regression equation, the intercept is the predicted value on the outcome, v in this case, when the value of the predictor, occasion in this case, is zero. The slope is the predicted change in v from one occasion to the next. In the example shown in the figure, I have made the first occasion the zero point; thus, the intercept latent variable reflects variability in v at baseline. Alternative patterns of fixed loadings (e.g., -1, 0, 1, 2) could have been used to change the occasion for which the mean and variance of v are reflected in the intercept latent variable. With four observations of v, I could specify up to three trajectory shapes; in addition to linear, I could include quadratic (one-bend) and cubic (two-bend) trajectories. The observant reader might wonder how a model that includes four latent variables could be identified with only four indicators. This is possible because, typically, all of the loadings are fixed. In the simple model shown in the figure, I have fixed the loadings on the linear latent variable to values that reflect a linear trajectory. As such, the fit of the model reflects the degree to which the pattern of means fits a line. A third latent variable, one corresponding to a quadratic trajectory, would have fixed loadings of 0, 1, 4, and 9.
Note that the first two loadings are the same for the linear and quadratic trajectories (such would be
the case for a cubic trajectory as well). I could capitalize on this fact and specify a model that, rather than specifying specific trajectories, allows the data to point to the trajectory that best fits the data. In such a model, I might relabel the linear latent variable simply “Slope,” and fix only the first two loadings. By freeing the last two loadings (fixed at 2 and 3 in the model shown in the figure), I permit them to be estimated from the data. Estimated values close to 2 and 3, respectively, for these loadings would suggest a linear pattern, whereas values closer to 4 and 9 would point to a quadratic pattern. The model shown in Figure 5.7 is an unconditional latent growth model, because no attempt is made to explain variability in the growth parameters. A conditional latent growth model could be specified in which other latent variables influence the intercept and linear latent variables. Significant paths from these predictor variables to the latent growth variables would indicate that standing on the growth variables is, to some extent, conditional on standing on the predictor variables. Although the predictor variables typically will be Level 2 variables that are measured at a single occasion, they, too, can be modeled as latent growth variables. In that case, the model could be used to test whether the intercept or pattern of change in one variable is conditional on the intercept or pattern of change in another variable. An interesting extension of latent growth modeling is growth mixture modeling. As the name suggests, growth mixture modeling is an integration of factor mixture modeling and latent growth modeling. As with the basic form of factor mixture modeling, a latent class variable is estimated that yields relatively homogeneous groups from a heterogeneous sample. Parameters in the measurement model, in this case a latent growth model, are compared between groups defined by the latent class variable. 
Comparison of latent variable means permits characterization of the groups in terms of variability in growth parameters. For instance, one group might be characterized by a linear trajectory that is essentially flat, whereas another group might be characterized by a negatively accelerating curvilinear trajectory. As with factor mixture models in general, potential predictors of class membership also can be included in the model. In the growth mixture case, significant predictors would be those variables that differentiate between classes defined by their growth parameters.
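To make the role of the fixed loadings in Figure 5.7 concrete, the sketch below computes the means implied by a growth model from the latent variable means and the fixed loadings. The values of the latent means are hypothetical and the helper name is invented for the example; the loadings (1, 1, 1, 1 for Intercept; 0, 1, 2, 3 for Linear; 0, 1, 4, 9 for a quadratic term) are the ones described in the text.

```python
def implied_means(latent_means, loadings):
    """Model-implied mean of v at each occasion: sum of loading * latent mean."""
    n_occasions = len(next(iter(loadings.values())))
    return [
        sum(loadings[name][t] * latent_means[name] for name in latent_means)
        for t in range(n_occasions)
    ]

linear_codes = [0, 1, 2, 3]
loadings = {
    "Intercept": [1, 1, 1, 1],
    "Linear": linear_codes,
}
# Hypothetical latent means: average baseline of 10, average change of 2 per occasion.
latent_means = {"Intercept": 10.0, "Linear": 2.0}
print(implied_means(latent_means, loadings))

# Recentering: with codes -1, 0, 1, 2 the intercept refers to the second occasion,
# so its mean is 12 rather than 10 for the same trajectory.
recentered = {"Intercept": [1, 1, 1, 1], "Linear": [-1, 0, 1, 2]}
print(implied_means({"Intercept": 12.0, "Linear": 2.0}, recentered))

# Loadings for a quadratic latent variable are the squares of the linear codes.
quadratic_codes = [code ** 2 for code in linear_codes]
print(quadratic_codes)
```

The two sets of loadings describe the same line; only the occasion to which the intercept refers has changed.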
Models with Structural Paths
To this point in the chapter, our focus has been exclusively on the measurement component of the more general model for which SEM methods are used. In the remainder of the chapter, our focus shifts to models that include directional, or structural, paths between some combination of latent variables and observed variables other than indicators of latent variables. When the structural paths involve one or more latent variables, it is assumed that the fit of the measurement model
has been established at a level that allows for less than perfect fit of the structural model (Anderson & Gerbing, 1988). To explain, because the structural model typically involves adding restrictions to the measurement model, the value of indices of fit will almost always decrease when structural paths are added to a measurement model. Thus, if, for instance, the criterion to be used to judge fit of the overall model is CFI = .90, a more stringent criterion of, say, CFI = .95 should be used for the measurement model. There are many types of models that include structural paths. Some include only observed variables and are nothing more than multivariate multiple regression models. Others include latent variables, but, because they are routinely estimated from cross-sectional data or longitudinal data that do not include all variables at all waves, they benefit little from analysis using SEM methods. Although these uses of SEM are common in social and personality psychology, I do not review them here. Instead, I focus on structural models for which SEM offers a distinct benefit, perhaps making the difference between finding statistical support for a hypothesis and concluding the hypothesis should be rejected.
Mediational Models
Theories in social and personality psychology typically go beyond simply positing the effect of one construct on another. They often articulate the processes by which the effect is presumed to occur. Although these processes can be inferred from the results of carefully designed experiments (Sigall & Mills, 1998), it is increasingly the case that attempts are made to measure them directly in the course of an experiment or within a survey. The variables that correspond to these processes are referred to as mediators, and, when they are measured directly, their putative role in producing an effect is tested in measurement-of-mediation models (Spencer, Zanna, & Fong, 2005). Traditionally, the effects in measurement-of-mediation models were estimated and tested in a series of multiple regression equations (Baron & Kenny, 1986). Because, with SEM, those equations can be estimated simultaneously and include latent variables, most new developments in the estimation and testing of mediated effects assume implementation using SEM. The simplest form of the measurement-of-mediation model includes three variables: an independent variable (i.e., predictor), a mediator variable, and a dependent variable (i.e., outcome). It has become standard to designate the independent variable as x, the dependent variable as y, and the mediator as z. The path from x to y, the direct effect, is labeled c. The path from x to y through z, the indirect effect, comprises two direct paths. The path from x to z is labeled a, and the path from z to y is labeled b. The product of a and b indexes the indirect effect and can be tested against zero using one of a number of standard errors. The test for mediation is most easily understood as a test of c in the presence of ab, or c′ (though, in the simple model, the test of ab is equivalent). If c′ is significantly
lower than c, then z is presumed to explain, at least in part, the effect of x on y. Our focus here is not interpretation, but it bears mention that the correct inference is not always clear for a host of reasons ranging from the effect of x on y when z is not included in the model (MacKinnon, Krull, & Lockwood, 2000) to how the standard error for ab is calculated and the critical ratio tested (MacKinnon, Lockwood, Hoffman, West, & Sheets, 2002). Why use SEM to estimate effects in the measurement-of-mediation model? In the simple measurement-of-mediation model, as ab increases, c decreases. If, in reality, z offers little in the explanation of the x-y relation, then a relatively small value of ab and relatively little difference between c and c′ is both expected and consistent with reality. If, however, z, in reality, contributes significantly to the explanation of the x-y relation but ab is underestimated, then the difference between c and c′ will be correspondingly underestimated and inconsistent with expectation and reality. Although a number of conditions might lead to the underestimation of ab, one that is addressed well by SEM is unreliability in z, the mediator (Hoyle & Kenny, 1999). The attenuating effect of unreliability in the mediator is evident in the equation $c'_{obs} = (1 - r_{zz})\,ab + c'_{true}$.
The quantity in parentheses indexes the degree of unreliability in z. When z is perfectly reliable (i.e., $r_{zz} = 1$), the term involving ab is reduced to zero and $c'_{obs} = c'_{true}$. As the degree of unreliability in z increases, its product with ab increases and the overestimation of $c'_{true}$ by $c'_{obs}$ increases. As would be expected, unreliability in one of the variables produces an attenuated effect; however, the net effect is an ironic overestimation of an effect that, ideally, would be small. SEM offers a means of estimating c′ with unreliability removed from z. When multiple indicators are available, this is as simple as modeling z as a latent variable. When z is represented by a single indicator in the data set and a well-documented value for the reliability of that indicator is available, unreliability can be removed from z by specifying it as a single-indicator latent variable with the loading fixed to 1.0 and the uniqueness fixed to $(1 - r_{zz})s_z^2$ (see Figure 2.1 and related text). With either strategy, the indirect effect is disattenuated, and an inference of mediation by z is more likely when such an inference is warranted.
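A small numeric sketch may help fix the attenuation equation and the single-indicator correction. All values below are hypothetical, and the function names are invented for the example.

```python
def observed_direct_effect(ab, c_prime_true, r_zz):
    """c'_obs = (1 - r_zz) * ab + c'_true, the attenuation equation given above."""
    return (1.0 - r_zz) * ab + c_prime_true

def fixed_uniqueness(r_zz, variance_z):
    """Uniqueness to fix for a single-indicator latent variable: (1 - r_zz) * s_z^2."""
    return (1.0 - r_zz) * variance_z

ab, c_prime_true = 0.30, 0.10  # hypothetical indirect effect and true direct effect
for r_zz in (1.0, 0.9, 0.7):
    c_obs = observed_direct_effect(ab, c_prime_true, r_zz)
    print(f"r_zz = {r_zz:.1f}: c'_obs = {c_obs:.2f}")

# Hypothetical mediator with reliability .70 and observed variance 4.0.
print(f"fixed uniqueness = {fixed_uniqueness(0.70, 4.0):.2f}")
```

With these made-up values, lowering the reliability of the mediator from 1.0 to .70 nearly doubles the observed direct effect even though the true direct effect is unchanged.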
Latent Interactions
In addition to articulating the processes that explain the causal effect of one variable on another, theoretical models in social and personality research often specify the conditions under which the effect would be stronger, weaker, or perhaps even reversed. Collectively, these conditions are referred to as moderators, and the strength and direction of their influence on the relation between two variables are
tested as interaction effects. Assume that, as with the measurement-of-mediation model, x designates an independent variable and y a dependent variable. Now let z designate a variable hypothesized to moderate the x–y relation. Moderation by z is tested by determining whether the effect of the product of x and z, xz, differs from zero. As with tests of mediation in the measurement-of-mediation model, tests of moderation are compromised by unreliability, which can be effectively removed before estimating them in SEM. When, as with randomized experiments, both the moderator and independent variable are categorical, SEM offers no advantage over ANOVA for testing the interaction effect unless multiple indicators are available for any covariates, which could be modeled as latent variables. When either the moderator or independent variable is categorical and the other is continuous, then multigroup modeling can be used to remove unreliability from the continuous variable by modeling it either as a latent variable with multiple indicators or as a single-indicator latent variable with fixed uniqueness as described earlier (Bagozzi & Yi, 1989). In that implementation, a model is simultaneously estimated for groups defined by the categorical variable and equality constraints used to test whether the effect of the continuous variable modeled as a latent variable differs across levels of the categorical variable. If the χ² difference is significant for the comparison of a model in which the effect of the latent variable has been constrained to be equal across levels of the categorical variable and one in which it is left free to vary, then the equality constraint is untenable, indicating a significant interaction effect. The advantage of SEM is most pronounced for tests of interaction involving two continuous variables. This is because the effect of unreliability is compounded by the fact that the test of the interaction effect is a test of the product of two variables.
As might be expected, unreliability in the xz product compounds the unreliability in x and z (Busemeyer & Jones, 1983). Specifically, the reliability of the interaction term can be expressed as

$$\frac{r_{xx} r_{zz} + r_{xz}^{2}}{1 + r_{xz}^{2}},$$
where rxx and rzz are reliabilities of the independent and moderator variables, respectively, and rxz is the correlation between them. When the independent and moderator variables are uncorrelated, the reliability of the product term is the product of the two reliabilities. Thus, if we assume typical values of reliability, say .80, the reliability of the product term will vary between .64, when x and z are uncorrelated, and .70, when they are strongly correlated. This diminished reliability is especially problematic because the detection of moderated effects under typical research conditions is a challenge for reasons other than unreliability in measurement (McClelland & Judd, 1993).
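The range just described is easy to verify. The sketch below evaluates the expression for the reliability of the product term at reliabilities of .80 and a few hypothetical values of the correlation between x and z; the function name is invented for the example.

```python
def product_term_reliability(r_xx, r_zz, r_xz):
    """Reliability of the xz product: (r_xx * r_zz + r_xz^2) / (1 + r_xz^2)."""
    return (r_xx * r_zz + r_xz ** 2) / (1.0 + r_xz ** 2)

# Hypothetical correlations between the independent and moderator variables.
for r_xz in (0.0, 0.15, 0.30, 0.45):
    rel = product_term_reliability(0.80, 0.80, r_xz)
    print(f"r_xz = {r_xz:.2f}: reliability of xz = {rel:.3f}")
```

Under these assumptions the reliability of the product starts at .64 when the two variables are uncorrelated and reaches roughly .70 when their correlation is about .45.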
SEM allows for the removal of unreliability in the product term by specification of a latent interaction variable. Imagine a model in which Fx is a latent independent variable with three reflective indicators, Fz is a latent moderator variable also with three reflective indicators, and FxFz is a latent interaction variable. Indicators of FxFz would be the products of all pairs of indicators between Fx and Fz (Kenny & Judd, 1984). For two reasons, this specification, though intuitive, is less than ideal. First, the number of indicators of the latent interaction variable can be very large even if Fx and Fz each have no more than a few indicators. For example, the recommended minimum of three indicators of each yields nine indicators of FxFz. The potentially large number of indicators compounds the second problem: The loadings and uniquenesses associated with FxFz are nonlinear transformations of their counterparts in Fx and Fz (Kenny & Judd, 1984). Although this nonlinearity can be incorporated into the specification of FxFz using constraints, for more than a few indicators of Fx and Fz, the specification becomes prohibitively complex (though see Ping, 1995, for a workaround). It is now evident that valid estimates of the latent interaction effect can be obtained when FxFz is specified with fewer than the full set of kx × kz indicators and no nonlinear constraints placed on parameters in the latent interaction variable (Marsh, Wen, Nagengast, & Hau, in press). Although the optimal number of indicators and strategy for testing parameters within the latent interaction variable have not yet been determined, current evidence favors an unconstrained matched-pairs strategy. Rather than all possible products, the strategy requires only enough products that every variable from the two sets is used once and only once.
So, for instance, if the indicators of Fx are x1, x2, and x3, and the indicators of Fz are z1, z2, and z3, the optimal number of indicators is three: x1z1, x2z2, and x3z3. These are specified as indicators of FxFz and the model estimated with no constraints other than identifying constraints on parameters in FxFz. When specified in this way, coefficients on paths from FxFz are corrected for attenuation due to unreliability in the indicators of Fx and Fz.
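The difference between the all-pairs and matched-pairs specifications can be sketched as follows. The scores are hypothetical and the function names are invented for the example. The sketch assumes equal numbers of indicators for Fx and Fz and leaves aside questions such as how indicators should be paired or whether they are centered before multiplication.

```python
from itertools import product

def all_pair_names(x_names, z_names):
    """Every x-by-z product, as in the all-pairs specification."""
    return [f"{x}{z}" for x, z in product(x_names, z_names)]

def matched_pair_names(x_names, z_names):
    """Each indicator used once and only once: x1z1, x2z2, ..."""
    return [f"{x}{z}" for x, z in zip(x_names, z_names)]

def matched_pair_products(case, x_names, z_names):
    """Product-indicator scores for one case (a dict of indicator scores)."""
    return {f"{x}{z}": case[x] * case[z] for x, z in zip(x_names, z_names)}

x_names = ["x1", "x2", "x3"]
z_names = ["z1", "z2", "z3"]
print(len(all_pair_names(x_names, z_names)))   # number of all-pairs indicators
print(matched_pair_names(x_names, z_names))    # matched-pairs indicators

# Hypothetical scores for a single case.
case = {"x1": 2.0, "x2": -1.0, "x3": 0.5, "z1": 1.0, "z2": 3.0, "z3": -2.0}
print(matched_pair_products(case, x_names, z_names))
```

The resulting product scores would be added to the data set and specified as the indicators of FxFz.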
Autoregressive Models
Autoregressive models are those in which, in effect, a set of variables is regressed on itself at a prior point in time. Such models often have three or more waves of data, in which case the autoregression is repeated until the set of variables from each successive wave of data has been regressed on the set of variables from the immediately prior wave. The period of time between waves is referred to as lag, and effects from one wave to the next are lagged effects. In the prototypic model, only lag 1 effects are specified. That is, endogenous variables at one wave are influenced only by variables at the immediately prior wave. These effects are specified between each pair of waves so that, with three or more waves, the effects are replicated. In two-wave models, only lag 1 effects can be specified and they
94
05-Hoyle-4154-Ch-05.indd 94
11/01/2011 12:58:36 PM
modeling possibilities and practicalities
F1t1
F1t2
F1t3
F2t1
F2t2
F2t3
Figure 5.8 Cross-lagged panel model with latent variables (variances and parameters are omitted to improve readability)
are not replicated. With three or more waves of data, it is possible to estimate lag 2 or higher effects, though such effects are rare when the more proximal lag 1 effects have been accounted for. Also, the number of higher-order effects is limited by identification concerns. The estimation of autoregressive models requires panel data, in which all variables to be modeled are measured at each time point. Autoregressive models are an example of time-series models, though that term typically is reserved for models that include observation at many time points, perhaps “interrupted” by a naturally occurring event of interest or an intervention. In autoregressive models, two types of paths are of interest. These can be seen in Figure 5.8. The path diagram indicates a model in which four indicators of two latent variables have been measured at three points in time. The autoregressive paths are those in which a variable at one point in time is related to itself at a later point in time. Two types of autoregressive paths are evident in this model. Those
typical of most analyses of longitudinal data are the wave-to-wave directional paths for each latent variable (e.g., F1t1 → F1t2). These paths index variance in the latent variable that is stable from one wave to the next, and for that reason are sometimes referred to as stability paths. A second set of autoregressive paths comprises the wave-to-wave covariances between uniquenesses for each indicator. These assume that some portion of the uniquenesses is stable from wave to wave. This model includes only lag 1 paths.

The second type of path in the model runs from a latent variable at one wave to the other latent variable at the subsequent wave (e.g., F1t1 → F2t2). These are referred to as cross-lagged paths, contributing to one name for the model as a whole: the cross-lagged panel model. This model offers distinct advantages for studying the directionality of the relation between two variables when neither is (or could be) manipulated in a randomized experiment. Because stability is modeled directly, the cross-lagged paths do not reflect covariance between the stable portions of variance in the two variables. Because both directions of effect are modeled, neither is privileged a priori, and their relative strength can be compared statistically. Moreover, the outcome of this comparison is not affected by differential reliability of the two constructs: both are modeled as latent variables; thus, all paths are corrected for attenuation due to unreliability.

In most applications of this model, the paths of interest are the cross-lagged paths. They can be used to answer two fundamental questions: (1) Is there any relation between the two variables beyond the covariance of their stable components? (2) If so, is there evidence favoring one direction over the other? The first question is addressed by testing the significance of the parameter estimates for the cross-lagged paths.
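As a rough illustration of a significance test for an individual cross-lagged path, the sketch below computes the critical ratio (the estimate divided by its standard error) and a two-tailed p value, assuming the usual large-sample normal approximation. The function name, the estimates, and the standard errors are hypothetical; in practice these values would be read from SEM software output.

```python
import math


def critical_ratio(estimate, std_error):
    """Return (z, two-tailed p) for the null hypothesis that the path is zero."""
    z = estimate / std_error
    # Two-tailed standard normal probability, written with the
    # complementary error function from the standard library.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Hypothetical estimates and standard errors for one pair of cross-lagged paths.
paths = {
    "F1t1 -> F2t2": (0.24, 0.08),
    "F2t1 -> F1t2": (0.05, 0.07),
}

for name, (est, se) in paths.items():
    z, p = critical_ratio(est, se)
    print(f"{name}: z = {z:.2f}, p = {p:.3f}")
```

With these made-up numbers only the first path would be judged significant at the conventional .05 level.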
If an omnibus test is desired, the full set of cross-lagged paths can be set to zero and the fit of the resultant model compared to the fit of a model in which they are free using the χ2 difference test. If there is evidence of a directional relation, then the next step is to determine whether one path is stronger than the other. This is accomplished by constraining the pair of cross-lagged paths between two adjacent waves to be equal and comparing the fit of the resultant model to the fit of a model without the constraint. If the fit of the model declines significantly, then the equality constraint is not tenable, indicating that the paths are not equivalent. The analyses could yield a number of different patterns, each leading to a different conclusion about the relation between the variables. If both cross-lagged path coefficients are nonsignificant, then there is no evidence of a directional relation. If only one of the two path coefficients is significant and the two paths differ significantly, there is evidence favoring one direction of causality. If both path coefficients are significant (whether statistically equivalent or not), then the evidence favors bidirectional influence. One issue that arises when data from three or more waves are available is whether the within- and between-wave parameters are equivalent across waves. Stationarity (an instance of invariance) holds when adding equality constraints for the same
parameters from wave to wave does not produce a significant decline in fit relative to a model with no such constraints. The strategic use of equality constraints allows for focused tests of hypotheses and assumptions in the cross-lagged panel model more than in any other model I have described.

I noted earlier that the trait–state–error model can be expanded to produce a form of the cross-lagged panel model. In the bivariate form of the model, the state component of a second variable is separated from its trait and error components. Cross-lagged paths are specified between the state component of each variable and the state component of the other variable. The trait–state–error form of the cross-lagged panel model is best used with four or more waves of data. A particular advantage of the model is that it allows for the management of unreliability with only a single indicator of each construct.
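The nested-model comparisons described above for the cross-lagged panel model (fixing cross-lagged paths to zero, or constraining parameters to equality) all rest on the χ2 difference test. The following sketch shows the arithmetic, using only the Python standard library. The function names and the fit statistics are hypothetical, and the sketch assumes ordinary normal-theory χ2 values; scaled statistics require an adjusted difference test (Satorra & Bentler, 2001).

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Even df: finite sum exp(-y) * sum_{j < df/2} y**j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= y / j
            total += term
        return math.exp(-y) * total
    # Odd df: start from the df = 1 case and add one term per two extra df.
    total = math.erfc(math.sqrt(y))
    term = math.sqrt(y) * math.exp(-y) / math.gamma(1.5)
    for j in range(df // 2):
        total += term
        term *= y / (j + 1.5)
    return total


def chi2_difference_test(chi2_constrained, df_constrained, chi2_free, df_free):
    """Compare nested models; the constrained model has the larger df."""
    d_chi2 = chi2_constrained - chi2_free
    d_df = df_constrained - df_free
    if d_df <= 0:
        raise ValueError("the constrained model must have more df")
    return d_chi2, d_df, chi2_sf(d_chi2, d_df)


# Hypothetical fit statistics for a three-wave, two-variable model:
# the constrained model fixes all four cross-lagged paths to zero.
d_chi2, d_df, p = chi2_difference_test(61.3, 44, 48.9, 40)
print(f"delta chi2 = {d_chi2:.1f}, delta df = {d_df}, p = {p:.3f}")
```

In this made-up example the decline in fit is significant at the .05 level, which would be read as evidence that at least one cross-lagged path is nonzero. The same function applies to the equality-constraint and stationarity tests, with the difference in degrees of freedom equal to the number of constraints added.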
SEM in Social and Personality Psychology

The growing set of statistical methods for modeling data that can be implemented within the SEM framework offers researchers in social and personality psychology a comprehensive, unified strategy for developing and testing multivariate models. As the selective review of modeling possibilities in this chapter shows, SEM is remarkably flexible, allowing the specification of models that reflect complex and subtle properties of the processes and structures characteristic of conceptual models and theories in social and personality psychology (see Hoyle, in press, for additional possibilities). Traditional methods such as ANOVA, multiple regression analysis, and exploratory factor analysis will, no doubt, continue to be useful for many analytic and modeling problems. However, as methods of data collection become more sophisticated (e.g., neuroimaging, genotyping, ecological momentary assessments) and hypotheses more subtle, these methods will increasingly fall short. SEM offers considerable promise as a means of keeping pace analytically with the methodological and conceptual developments in social and personality psychology.
References
Aiken, L. S., West, S. G., & Millsap, R. E. (2008). Doctoral training in statistics, measurement, and methodology in psychology: Replication and extension of Aiken, West, Sechrest, and Reno’s (1990) survey of PhD programs in North America. American Psychologist, 63, 32–50. Anderson, J. C., & Gerbing, D. W. (1988). Structural equation modeling in practice: A review and recommended two-step approach. Psychological Bulletin, 103, 411–423. Arbuckle, J. L. (1996). Full information estimation in the presence of incomplete data. In G. A. Marcoulides, & R. E. Schumacker (Eds.), Advanced structural equation modeling: Issues and techniques (pp. 243–278). Mahwah, NJ: Erlbaum. Bagozzi, R. P., & Yi, Y. (1989). On the use of structural equation models in experimental designs. Journal of Marketing Research, 26, 271–284. Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173–1182. Baumrind, D. (1983). Specious causal attribution in the social sciences: The reformulated stepping-stone theory of heroin use. Journal of Personality and Social Psychology, 45, 1289–1298. Bentler, P. M. (1980). Multivariate analysis with latent variables: Causal modeling. Annual Review of Psychology, 31, 419–456. Bentler, P. M. (1986). Lagrange multiplier and Wald tests for EQS and EQS/PC. Los Angeles: BMDP Statistical Software. Bentler, P. M. (1990). Comparative fit indexes in structural models. Psychological Bulletin, 107, 238–246. Bentler, P. M., & Bonett, D. G. (1980). Significance tests and goodness-of-fit in the analysis of covariance structures. Psychological Bulletin, 88, 588–606. Bentler, P. M., & Chou, C.-P. (1986, April). Statistics for parameter expansion and contraction in structural models. Paper presented at the Annual Meeting of the American Educational Research Association, San Francisco. Bentler, P. M., & Speckart, G. 
(1979). Models of attitude-behavior relations. Psychological Review, 47, 265–276. Bentler, P. M., & Weeks, D. G. (1980). Linear structural equations with latent variables. Psychometrika, 45, 289–308. Blalock, H. M. (1961). Correlation and causality: The multivariate case. Social Forces, 39, 246–251. Blalock, H. M. (1964). Causal inferences in nonexperimental research. Chapel Hill, NC: University of North Carolina Press.
Blalock, H. M. (Ed.) (1971). Causal models in the social sciences. Chicago: Aldine-Atherton. Blunch, N. J. (2008). Introduction to structural equation modelling using SPSS and AMOS. Thousand Oaks, CA: Sage. Bollen, K. A. (1989). Structural equations with latent variables. New York: Wiley. Bollen, K. A. (2002). Latent variables in psychology and the social sciences. Annual Review of Psychology, 53, 605–634. Bollen, K. A., & Davis, W. R. (2009). Causal indicator models: Identification, estimation, and testing. Structural Equation Modeling, 16, 498–522. Bollen, K. A., & Hoyle, R. H. (1990). Perceived cohesion: A conceptual and empirical examination. Social Forces, 69, 479–504. Bollen, K. A., & Lennox, R. (1991). Conventional wisdom on measurement: A structural equation perspective. Psychological Bulletin, 110, 305–314. Bollen, K. A., & Stine, R. A. (1992). Bootstrapping goodness-of-fit measures in structural equation models. Sociological Methods and Research, 21, 205–229. Breckler, S. J. (1990). Applications of covariance structure modeling in psychology: Cause for concern? Psychological Bulletin, 107, 260–273. Briggs, S. R., & Cheek, J. M. (1986). The role of factor analysis in the development and evaluation of personality scales. Journal of Personality, 54, 106–148. Brown, T. A. (2006). Confirmatory factor analysis for applied research. New York: Guilford Press. Browne, M. W. (1984). Asymptotic distribution-free methods in the analysis of covariance structures. British Journal of Mathematical and Statistical Psychology, 37, 62–83. Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In K. A. Bollen, & J. S. Long (Eds.), Testing structural equation models (pp. 136–162). Thousand Oaks, CA: Sage. Browne, M. W., & Shapiro, A. (1987). Adjustments for kurtosis in factor analysis with elliptically distributed errors. Journal of the Royal Statistical Society: Series B, 49, 346–352. Bryk, A. S., & Raudenbush, S. W. (1992). Hierarchical linear models. 
Thousand Oaks, CA: Sage. Busemeyer, J. R., & Jones, L. D. (1983). Analysis of multiplicative combination rules when the causal variables are measured with error. Psychological Bulletin, 93, 549–562. Byrne, B. M. (1998). Structural equation modeling with LISREL, PRELIS, and SIMPLIS: Basic concepts, applications, and programming. New York: Psychology Press. Byrne, B. M. (2006). Structural equation modeling with EQS: Basic concepts, applications, and programming (2nd ed.). Mahwah, NJ: Erlbaum. Byrne, B. M. (2010). Structural equation modeling with AMOS: Basic concepts, Applications, and programming (2nd ed.). New York: Taylor & Francis. Byrne, B. M. (2011). Structural equation modeling with Mplus: Basic concepts, applications, and programming. New York: Psychology Press. Byrne, B. M., Shavelson, R. J., & Muthén, B. (1989). Testing for the equivalence of factor covariance and mean structures: The issue of partial measurement invariance. Psychological Bulletin, 105, 456–466. Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81–105. Cliff, N. (1983). Some cautions concerning the application of causal modeling methods. Multivariate Behavioral Research, 18, 115–126.
Cohen, P., Cohen, J., Teresi, J., Marchi, M., & Velez, C. N. (1990). Problems in the measurement of latent variables in structural equations causal models. Applied Psychological Measurement, 14, 183–196. Cole, D. A., Martin, N. C., & Steiger, J. H. (2005). Empirical and conceptual problems with longitudinal trait–state models: Introducing a trait-state-occasion model. Psychological Methods, 10, 3–20. Cole, D. A., Maxwell, S. E., Arvey, R. D., & Salas, E. (1993). Multivariate group comparisons of variable systems: MANOVA and structural equation modeling. Psychological Bulletin, 114, 174–184. Collins, L. M., Schafer, J. L., & Kam, C.-H. (2001). A comparison of inclusive and restrictive strategies in modern missing data procedures. Psychological Methods, 6, 330–351. Crocker, J., Luhtanen, R. K., Cooper, M. L., & Bouvrette, A. (2003). Contingencies of selfworth in college students: Theory and measurement. Journal of Personality and Social Psychology, 85, 894–908. DeShon, R. P. (1998). A cautionary note on measurement error corrections in structural equation models. Psychological Methods, 4, 412–423. Duncan, O. D. (1966). Path analysis: Sociological examples. American Journal of Sociology, 74, 119–137. Duncan, O. D. (1975). Introduction to structural equation models. New York: Academic Press. Duvall, J. L., Hoyle, R. H., & Robinson, J. I. (2010). Contingency of self-esteem on appearance, salience of appearance, and awareness of the public self. Manuscript submitted for publication. Edwards, J. R., & Bagozzi, R. P. (2000). On the nature and direction of relationships between constructs and measures. Psychological Methods, 5, 155–174. Fenigstein, A., Scheier, M. G., & Buss, A. H. (1975). Public and private self-consciousness: Assessment and theory. Journal of Consulting and Clinical Psychology, 43, 522–528. Fishbein, M., & Ajzen, I. (1975). Belief, attitude, intention, and behavior: An introduction to theory and research. Reading, MA: Addison-Wesley. Flora, D. 
B., & Curran, P. J. (2004). An empirical evaluation of alternative methods of estimation for confirmatory factor analysis with ordinal data. Psychological Methods, 9, 466–491. Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13, 465–486. Freedman, D. A. (1987). As others see us: A case study in path analysis. Journal of Educational Statistics, 12, 101–128. Goldberger, A. S. (1971). Econometrics and psychometrics: A survey of commonalities. Psychometrika, 36, 83–107. Goldberger, A. S., & Duncan, O. D. (Eds.) (1973). Structural equation models in the social sciences. New York: Academic Press. Graham, J. W., Olchowski, A. E., & Gilreath, T. D. (2007). How many imputations are really needed? Some practical clarifications of multiple imputation theory. Prevention Science, 8, 206–213. Habashi, M. M., & Wegener, D. (2008). Preliminary evidence that agreeableness is more closely related to responsiveness than conformity. Unpublished manuscript, Purdue University.
Hershberger, S. L. (1994). The specification of equivalent models before the collection of data. In A. von Eye, & C. Clogg (Eds.), The analysis of latent variables in developmental research (pp. 68–105). Thousand Oaks, CA: Sage. Hoyle, R. H. (2007). Applications of structural equation modeling in personality research. In R. Robins, C. Fraley, & R. Krueger (Eds.), Handbook of research methods in personality psychology (pp. 444–460). New York: Guilford Press. Hoyle, R. H. (Ed.) (in press). Handbook of structural equation modeling. New York: Guilford Press. Hoyle, R. H., & Duvall, J. L. (2004). Determining the number of factors in exploratory and confirmatory factor analysis. In D. Kaplan (Ed.), Handbook of quantitative methodology for the social sciences (pp. 301–315). Thousand Oaks, CA: Sage. Hoyle, R. H., & Kenny, D. A. (1999). Sample size, reliability, and tests of statistical mediation. In R. H. Hoyle (Ed.), Statistical strategies for small sample research (pp. 195–222). Thousand Oaks, CA: Sage. Hu, L.-T., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1–5. Johnson, D. R., & Creech, J. C. (1983). Ordinal measures in multiple indicator models: A simulation study of categorization error. American Sociological Review, 48, 398–407. Jöreskog, K. G. (1967). Some contributions to maximum likelihood factor analysis. Psychometrika, 32, 443–482. Jöreskog, K. G. (1969). A general approach to confirmatory maximum likelihood factor analysis. Psychometrika, 34, 183–202. Jöreskog, K. G. (1971). Simultaneous factor analysis in several populations. Psychometrika, 36, 409–426. Jöreskog, K. G. (1973). A general method for estimating a linear structural equation system. In A. S. Goldberger, & O. D. Duncan (Eds.), Structural equation models in the social sciences (pp. 85–112). New York: Academic Press. Jöreskog, K. G. (1993). Testing structural equation models. 
In K. A. Bollen, & J. S. Long (Eds.), Testing structural equation models (pp. 294–316). Thousand Oaks, CA: Sage. Kaplan, D. (1990). Evaluating and modifying covariance structure models. A review and recommendation. Multivariate Behavioral Research, 25, 137–155. Kaplan, D. (2009). Structural equation modeling: Foundations and extensions (2nd ed.). Thousand Oaks, CA: Sage. Keesling, J. W. (1972). Maximum likelihood approaches to causal analysis. Doctoral dissertation, University of Chicago. Kelloway, E. K. (1998). Using LISREL for structural equation modeling: A researcher’s guide. Thousand Oaks, CA: Sage. Kenny, D. A., & Judd, C. M. (1984). Estimating the nonlinear and interactive effects of latent variables. Psychological Bulletin, 96, 201–210. Kenny, D. A., & Kashy, D. A. (1992). Analysis of multitrait-multimethod matrix by confirmatory factor analysis. Psychological Bulletin, 112, 165–172. Kenny, D. A., & Zautra, A. (1995). The trait-state-error model for multiwave data. Journal of Consulting and Clinical Psychology, 63, 52–59. Kline, R. B. (2010). Principles and practice of structural equation modeling (3rd ed.). New York: Guilford Press. Lee, S., & Hershberger, S. (1990). A simple rule for generating equivalent models in covariance structure modeling. Multivariate Behavioral Research, 25, 313–334.
Lee, S.-Y., & Tang, N.-S. (2006). Bayesian analysis of structural equation models with mixed exponential family and ordered categorical data. British Journal of Mathematical and Statistical Psychology, 59, 151–172. Loehlin, J. C. (1992). Genes and environment in personality development. Thousand Oaks, CA: Sage. Lubke, G. H., & Muthén, B. (2005). Investigating population heterogeneity with factor mixture models. Psychological Methods, 10, 21–39. MacCallum, R. C. (2003). Working with imperfect models. Multivariate Behavioral Research, 38, 113–139. MacCallum, R. C., & Browne, M. W. (1993). The use of causal indicators in covariance structure models: Some practical issues. Psychological Bulletin, 114, 533–541. MacCallum, R. C., Browne, M. W., & Sugawara, H. M. (1996). Power analysis and determining sample size for covariance structure modeling. Psychological Methods, 1, 130–149. MacCallum, R. C., Roznowski, M., & Necowitz, L. B. (1992). Model modifications in covariance structure analysis: The problem of capitalization on chance. Psychological Bulletin, 111, 490–504. MacCallum, R. C., Wegener, D. T., Uchino, B. N., & Fabrigar, L. R. (1993). The problem of equivalent models in applications of covariance structure analysis. Psychological Bulletin, 114, 185–199. MacKinnon, D. P., Krull, J. L., & Lockwood, C. M. (2000). Equivalence of the mediation, confounding, and suppression effect. Prevention Science, 1, 173–181. MacKinnon, D. P., Lockwood, C. M., Hoffman, J. M., West, S. G., & Sheets, V. (2002). A comparison of methods to test the significance of the mediated effect. Psychological Methods, 7, 83–104. Marsh, H. W., Byrne, B. M., & Craven, R. (1992). Overcoming problems in confirmatory factor analyses of MTMM data: The correlated uniqueness model and factorial invariance. Multivariate Behavioral Research, 27, 489–507. Marsh, H. W., Hau, K.-T., Balla, J. R., & Grayson, D. (1998). Is more ever too much? The number of indicators per factor in confirmatory factor analysis. 
Multivariate Behavioral Research, 33, 181–220. Marsh, H. W., Wen, Z., Nagengast, B., & Hau, K.-T. (in press). Latent-variable approaches to tests of interaction effects. In R. H. Hoyle (Ed.), Handbook of structural equation modeling. New York: Guilford Press. McClelland, G. H., & Judd, C. M. (1993). Statistical difficulties of detecting interactions and moderator effects. Psychological Bulletin, 114, 376–390. Meehl, P. E., & Waller, N. G. (2002). The path analysis controversy: A new statistical approach to strong appraisal of verisimilitude. Psychological Methods, 7, 283–300. Muchinsky, P. M. (1996). The correction for attenuation. Educational and Psychological Measurement, 56, 63–75. Mueller, R. O. (1999). Basic principles of structural equation modeling: An introduction to LISREL and EQS. New York: Springer. Mulaik, S. A., & Millsap, R. E. (2000). Doing the four-step right. Structural Equation Modeling, 7, 36–73. Muthén, B. O. (1983). Latent variable structural equation modeling with categorical data. Journal of Econometrics, 22, 43–65. Myung, J. (2003). Tutorial on maximum likelihood estimation. Journal of Mathematical Psychology, 47, 90–100.
Newcomb, M. D. (1994). Drug use and intimate relationships among women and men: Separating specific from general effects in prospective data using structural equation models. Journal of Consulting and Clinical Psychology, 62, 463–476. Nylund, K. L., Asparouhov, T., & Muthén, B. O. (2007). Deciding on the number of classes in latent class analysis and growth mixture modeling: A Monte Carlo simulation study. Structural Equation Modeling, 14, 535–569. Pearl, J. (2009). Causality: Models, reasoning, and inference (2nd ed.). New York: Cambridge University Press. Pearl, J. (2010). An introduction to causal inference. The International Journal of Biostatistics, 6(2). Retrieved June 22, 2010 from www.bepress.com/cgi/viewcontent. cgi?article=1203&context=ijb. Ping, R. A., Jr. (1995). A parsimonious estimating technique for interaction and quadratic latent variables. Journal of Marketing Research, 32, 336–347. Rabe-Hesketh, S., Skrondal, A., & Pickles, A. (2004). Generalized multilevel structural equation modeling. Psychometrika, 69, 167–190. Raykov, T., & Marcoulides, G. A. (2001). Can there be infinitely many models equivalent to a given covariance structure model? Structural Equation Modeling, 8, 142–149. Reuman, D., Alwin, D., & Veroff, J. (1984). Assessing the validity of the achievement motive in the presence of random measurement error. Journal of Personality and Social Psychology, 47, 1347–1362. Rindskopf, D., & Rose, T. (1988). Some theory and applications of confirmatory secondorder factor analysis. Multivariate Behavioral Research, 23, 51–67. Rodgers, J. L. (2010). The epistemology of mathematical and statistical modeling: A quiet methodological revolution. American Psychologist, 65, 1–12. Rudman, L. (2011). Implicit measures for social and personality psychology. London: Sage. Salzberger, T., Sinkovics, R. R., & Schlegelmilch, B. B. (1999). Data equivalence in crosscultural research: A comparison of classical test theory and latent trait theory based approaches. 
Australasian Marketing Journal, 7, 23–38. Saris, W. E., & Satorra, A. (1993). Power evaluations in structural equation models. In K. A. Bollen, & J. S. Long (Eds.), Testing structural equation models (pp. 181–204). Thousand Oaks, CA: Sage. Satorra, A., & Bentler, P. M. (1994). Corrections to test statistics and standard errors in covariance structure analysis. In A. von Eye & C. C. Clogg (Eds.), Latent variables analysis: Applications for developmental research (pp. 399–419). Thousand Oaks, CA: Sage. Satorra, A., & Bentler, P. M. (2001). A scaled difference chi-square test statistic for moment structure analysis. Psychometrika, 66, 507–514. Satorra, A., & Saris, W. E. (1985). Power of the likelihood ratio test in covariance structure analysis. Psychometrika, 50, 83–90. Schumacker, R. E., & Lomax, R. G. (2004). A beginner’s guide to structural equation modeling (2nd ed.). Mahwah, NJ: Erlbaum. Sigall, H., & Mills, J. (1998). Measures of independent variables and mediators are useful in social psychology experiments: But are they necessary? Personality and Social Psychology Review, 2, 218–226. Sörbom, D. (1974). A general method for studying differences in factor means and factor structures between groups. British Journal of Mathematical and Statistical Psychology, 27, 229–239.
Sörbom, D. (1989). Model modification. Psychometrika, 54, 371–384. Sörbom, D. (2001). Karl Jöreskog and LISREL: A personal story. In R. Cudeck, S. Du Toit, & D. Sörbom (Eds.), Structural equation modeling: Present and future (pp. 3–10). Lincolnwood, IL: Scientific Software. Spencer, S. J., Zanna, M. P., & Fong, G. T. (2005). Establishing a causal chain: Why experiments are often more effective than mediational analysis in examining psychological processes. Journal of Personality and Social Psychology, 89, 845–851. Steiger, J. H. (2001). Driving fast in reverse: The relationship between software development, theory, and education in structural equation modeling. Journal of the American Statistical Association, 96, 331–338. Steiger, J. H. (2002). When constraints interact: A caution about reference variables, identification constraints, and scale dependencies in structural equation modeling. Psychological Methods, 7, 210–227. Steiger, J. H., & Lind, J. C. (1980, May). Statistically based tests for the number of common factors. Paper presented at the Annual Meeting of the Psychometric Society, Iowa City, IA. Stelzl, I. (1986). Changing a causal hypothesis without changing the fit: Some rules for generating equivalent path models. Multivariate Behavioral Research, 21, 309–331. Tanaka, J. S., & Bentler, P. M. (1983). Factor invariance of premorbid social competence across multiple populations of schizophrenics. Multivariate Behavioral Research, 18, 135–146. Tanaka, J. S., & Huba, G. J. (1984). Confirmatory hierarchical factor analysis of psychological distress measures. Journal of Personality and Social Psychology, 46, 621–635. Thurstone, L. L. (1954). An analytical method for simple structure. Psychometrika, 19, 173–194. West, S. G., Finch, J. F., & Curran, P. J. (1995). Structural equation models with nonnormal variables. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 56–75). Thousand Oaks, CA: Sage. Weston, R., & Gore, P. 
A., Jr. (2006). A brief guide to structural equation modeling. The Counseling Psychologist, 34, 719–751. Widaman, K. F., & Reise, S. P. (1997). Exploring the measurement invariance of psychological instruments: Applications in the substance use domain. In K. J. Bryant, M. Windle, & S. G. West (Eds.), The science of prevention: Methodological advances from alcohol and substance abuse research (pp. 281–323). Washington, DC: American Psychological Association. Wiley, D. E. (1973). The identification problem for structural equation models with unmeasured variables. In A. S. Goldberger, & O. D. Duncan (Eds.), Structural equation models in the social sciences (pp. 69–83). New York: Academic Press. Willett, J. B., & Sayer, A. G. (1994). Using covariance structure analysis to detect correlates and predictors of individual change over time. Psychological Bulletin, 116, 363–381. Wright, S. (1920). The relative importance of heredity and environment in determining the piebald pattern of guinea-pigs. Proceedings of the National Academy of Sciences, 6, 320–332. Wright, S. (1934). The method of path coefficients. Annals of Mathematical Statistics, 5, 161–215. Young, F. W. (1996). ViSta: The visual statistics system. Chapel Hill, NC: L.L. Thurstone Psychometric Laboratory Research Memorandum 94–1(c).
Index Aiken, L. S., 6, 53 Ajzen, I., 6 Alwin, D., 6 alternative models approach, 7, 8, 11, 16, 49, 78–9 analysis of variance (ANOVA), 1, 2, 3, 23–5 Anderson, J. C., 22, 91 Arbuckle, J. L., 51 Arvey, R. D., 21 Asparouhov, T., 88 augmented moment matrix, 87 autoregressive models, 94–7 Bagozzi, R. P., 9, 93 Balla, J. R., 20 Baron, R. M, 8, 91 Baumrind, D., 6, 18, 23 Bentler, P. M., 6, 17, 30, 44, 46–50, 58–61, 66, 70–2 Bentler-Weeks notation, 30–3 Blalock, H. M., 5, 23, 70 Blunch, N.J., 52 Bollen, K.A., 1, 44, 76–7, 86 Bonett, D.G., 17, 46 bootstrapping, 44 Bouvrette, A., 62 Breckler, S. J., 6, 18 Briggs, S. R., 60 Brown, T. A., 76 Browne, M. W., 43–4, 48, 50, 66, 77 Bryk, A. S., 89 Busemeyer, J. R., 93 Buss, A. H., 62 Byrne, B. M., 52–3, 83, 86 Campbell, D. T., 83 categorical, 44 categorical variable methodology (CVM), 44 causality, 24–5, 70–1 Cheek, J. M., 60 chi-square (χ2) test, 45–6 Chou, C.-P., 59, 61 Cliff, N., 6, 18
07-Hoyle-4154-Index.indd 107
Cohen, J., 9 Cohen, P., 9 Cole, D. A., 21, 84 Collins, L.M., 51 comparative fit index (CFI), 47 confirmatory factor analysis, 76, 79 convergence, 11, 38, 41–2 Cooper, M. L., 62 correlated uniquenesses, 57, 66, 83 covariance, 64 covariance structure modeling, 2–3 Craven, R., 83 critical ratio, 50, 68–9 Creech, J. C., 43 Crocker, J., 62 cross-lagged panel model, 96 Cudeck, R., 48, 66 Curran, P. J., 43, 80 Davis, W. R., 77 degrees of freedom, 10, 45, 64 DeShon, R. P., 21, 70 direct effect, 8 directional relation, 23 disturbance, 9 Duncan, O. D., 5 Duvall, J. L., 62, 79 Edwards, J. R., 9 endogenous variable, 9 error, 4 equality constraints, 28–9, 75, 86, 96 equivalent models, 11, 72–3 estimation, 10, 17, 37, see also maximum likelihood estimation evaluation of fit, 17, 48, 66 exogenous variable, 9 Fabrigar, L. R., 72 factor, see latent variable factor mixture model, 87–8 Fenigstein, A., 62 Finch, J. F., 43
11/01/2011 12:59:22 PM
structural equation modeling for social and personality psychology
Fishbein, M., 6 Fiske, D. W., 83 fit, 11, 27, 45, see also evaluation of fit fitting function, 41, 44 fixed parameters, see parameter, fixed Flora, D. B., 80 Fong, G. T., 91 Fox, J., 53 free parameters, see parameter, free Freedman, D. A., 18 full information maximum likelihood (FIML), 51 Gerbing, D. W., 22, 91 Gilreath, T.D., 51 Goldberger, A. S., 5 goodness of fit, see evaluation of fit Gore, P. A., Jr., 1 Graham, J. W., 51 Grayson, D., 20 growth curve modeling, see latent growth modeling growth mixture modeling, 90 Habashi, M. M., 29 Hau, K.-T., 20, 94 Hershberger, S. L., 73 Heywood case, 41–2, 83 Hoffman, J. M., 92 Hoyle, R. H., 1, 14, 62, 79, 86, 92, 97 Hu, L.-T., 47, 66 Huba, G. J., 6 hypothesis, 17, 37, 45 identification, 10, 26–8, 36–8, 77 independence model, 46 implied covariance matrix, 10, 11, 37, 39, 42 indicator, 9, 19–21 formative, 9, 76–7 reflective, 9, 76 indirect effect, 8–9 interpretation, 18 invariance, see measurement invariance iteration, 10, 17, 38, 41 Johnson, D. R., 43 Jones, L. D., 93 Jöreskog, K. G., 5, 7, 38, 52, 86 Judd, C. M., 93, 94 just identified, 10, 36 Kam, C-H., 51 Kaplan, D., 1, 50
Keesling, J. W., 5 Kelloway, E. K., 52 Kenny, D. A., 8, 14, 83, 91–2, 94 Kashy, D. A., 83 Kline, R. B., 1 Krull, J. L., 92 Lagrange multiplier (LM) test, 59, 67 latent growth model, 88–90 latent interaction, 92–4 latent mean, see structured mean latent variable, 1, 9, 20–1, 70 first order, 76 higher order, 80 Lee, S., 73 Lee, S.-Y., 44 Lennox, R., 76 Lind, J. C., 47, 66 LISREL model, 5–6 LISREL notation, 33–4 LISREL software, 51–2 Lockwood, C. M., 92 Loehlin, J. C., 72 Lomax, R. G., 1 Lubke, G. H., 87 Luhtanen, R. K., 62 MacCallum, R. C., 7, 44, 50–1, 55, 72, 77 MacKinnon, D. P., 92 Marchi, M., 9 Marcoulides, G. A., 72 Marsh, H. W., 20, 83, 94 Martin, N. C., 84 maximum likelihood (ML) estimation, 38–43 Maxwell, S. E., 21 McClelland, G. H., 93 measurement error, 22, 70, 92–3 measurement invariance, 84–7 measurement model, see model, measurement mediation, 91–2 Meehl, P. E., 44 Mills, J., 91 Millsap, R. E., 6, 47 missing data, 51 model, 7 comparison, see nested model comparison measurement, 5, 22 structural, 5, 22 specification, see specification model generating approach, 7 model modification, 11, 17, 18, 54–6, 72 model vs. analyze, 4
moderation, see latent interaction modification index, see Lagrange multiplier (LM) test Muchinsky, P. M., 14 Mueller, R. O., 52 Mulaik, S. A., 47 multigroup modeling, 75 multiple regression analysis, 1, 8, 24–5 multitrait-multimethod model, 82–3 multivariate normality, 42–3, 48, 64 Muthén, B. O., 44, 80, 86–8 Myung, J., 38 Nagengast, B., 94 Necowitz, L. B., 7 nested models, 11, 49, 57 test (Δχ²), 49, 57, 58 Newcomb, M. D., 60 nondirectional relation, 23 nonnormality, 48, 57–8 normed fit index (NFI), 46 notation, see Bentler-Weeks notation, see LISREL notation null model, see independence model Nylund, K. L., 88 Olchowski, A. E., 51 overidentified, 10, 27, 37 parameter, 1, 8 estimates, 39–41 fixed, 8, 14, 25–6, 28 free, 8, 14, 25 parcel, 19–20 path analysis, 5, 12 path diagram, 12–14, 29–30, 85 examples, 13, 22, 30, 57, 63, 73, 77–8, 81–2, 84–5, 89, 95 Pearl, J., 71 Pickles, A., 53 Ping, R. A., Jr., 94 power, 50–1 Rabe-Hesketh, S., 53 Raudenbush, S. W., 89 Raykov, T., 72 Reise, S. P., 86 residual matrix, 11 residual covariance, 56 Reuman, D., 6 Rindskopf, D., 62, 81 Robinson, J. I., 62
robust statistics, 44, 48 Rodgers, J. L., 4 Rose, T., 62, 81 root mean square error of approximation (RMSEA), 47–8, 50–1 Roznowski, M., 7 Rudman, L., 20 Salas, E., 21 Salzberger, T., 87 sample size, 43, 50, 54–5, 71–2 Saris, W. E., 50 Satorra, A., 44, 48–50, 58 Sayer, A. G., 88–9 scaled statistics, 44, 48–50, 58 Schafer, J. L., 51 Scheier, M. G., 62 Schlegelmilch, B. B., 87 Schumacker, R. E., 1 Shapiro, A., 44 Shavelson, R. J., 86 Sheets, V., 92 Sigall, H., 91 simple structure, 78 Sinkovics, R. R., 87 Skrondal, A., 53 software, 51–3 Sörbom, D., 51–2, 59, 75 specific effect, 60 specification, 7, 16–17 specification searching, 11, 55 automated, 55, 58–61, 67 manual, 55–8, 66 Speckart, G., 6 Spencer, S. J., 91 start values, 10, 39 Steiger, J. H., 26, 47, 53, 66, 84 Stelzl, I., 73 Stine, R. A., 44 strictly confirmatory approach, 7 structural model, see model, structural structure, 10 structured mean, 86, 88 subfactor, 62–3, 81–4 Sugawara, H. M., 50 Tanaka, J. S., 6 Tang, N.-S., 44 Teresi, J., 9 Thurstone, L. L., 78 total effect, 9 trait-state-error model, 83–4
Uchino, B. N., 72 underidentified, 10, 26 unidentified, see underidentification uniqueness, 9, 13, 21, see also correlated uniquenesses unrestricted factor model, 79 Velez, C. N., 9 Veroff, J., 6 Wald (W) test, 60–1, 79 Waller, N. G., 44 Weeks, D. G., 30 Wegener, D. T., 29, 72 weight matrix, 43
weighted least squares estimation, 80 Wen, Z., 94 West, S. G., 6, 43, 92 Weston, R., 1 Widaman, K. F., 86 Wiley, D. E., 5 Willett, J. B., 88–9 Wright, S., 5, 12 Yi, Y., 93 Young, F. W., 42 z test, see critical ratio Zanna, M. P., 91 Zautra, A., 83