Structural Equation Modeling With LISREL, PRELIS, and SIMPLIS: Basic Concepts, Applications, and Programming
MULTIVARIATE APPLICATIONS BOOK SERIES I am delighted and honored to be introducing the latest book from Barbara Byrne, a remarkable scholar and methodological spokesperson. As the second book in Lawrence Erlbaum Associates' recent Multivariate Application book series, Structural Equation Modeling with LISREL, PRELIS, and SIMPLIS describes intricate methodology clearly and understandably. What's more, Dr. Byrne leaves readers assured that they can also tackle the nuances and complexities of structural equation modeling with ease and facility. This kind of book epitomizes the very goals and substance that were intended for this series. Dr. Byrne's background is aptly suited for turning out the kind of high-quality scholarship that has become her trademark. Her research focuses on structural equation modeling, largely applied to self-concept data. Over the last 15 years, Barbara's laudatory and prolific record includes several major books, dozens of research publications, and numerous presentations and invited addresses. In quantitative terms alone, Barbara's work is impressive. A qualitative review of her scholarship is equally compelling. Few people could match Barbara's talent for describing complex methodology in completely understandable words. Barbara's writings on structural equation modeling are probably on the shelves of every academic institution and laboratory in North America. Whereas many, if not most, researchers focus their work on a limited academic audience, Barbara prefers to write in a manner that is accessible to a wide array of researchers, methodologists, teachers, practitioners, and students. I have used her structural equation modeling books in a graduate class that I teach and they have been unequivocal hits! In closing, I am pleased to welcome Structural Equation Modeling with LISREL, PRELIS, and SIMPLIS, a book that conveys Dr. Byrne's erudite knowledge in a thoroughly lucid and comprehensible style. Lisa L. 
Harlow, Ph.D., Series Editor Department of Psychology University of Rhode Island 10 Chafee Rd., Suite 8 Kingston, RI 02881-0808 Phone: 401-874-4242 Fax: 401-874-5562 E-Mail: [email protected] Other books in the series:
Harlow/Mulaik/Steiger • What If There Were No Significance Tests?
Structural Equation Modeling with LISREL, PRELIS, and SIMPLIS: Basic Concepts, Applications, and Programming Barbara M. Byrne University of Ottawa
LAWRENCE ERLBAUM ASSOCIATES, PUBLISHERS 1998
Mahwah, New Jersey
London
Copyright © 1998 by Lawrence Erlbaum Associates, Inc. All rights reserved. No part of this book may be reproduced in any form, by photostat, microfilm, retrieval system, or any other means, without the prior written permission of the publisher. Lawrence Erlbaum Associates, Inc., Publishers 10 Industrial Avenue Mahwah, New Jersey 07430 Cover Design by Kathryn Houghtaling Lacey
Library of Congress Cataloging-in-Publication Data Byrne, Barbara M. Structural equation modeling with LISREL, PRELIS, and SIMPLIS : basic concepts, applications, and programming / Barbara M. Byrne. p. cm. Includes bibliographical references and index. ISBN 0-8058-2924-5 (alk. paper) 1. Multivariate analysis. 2. Social sciences-Statistical methods. I. Title. QA278.B97 1998 519.5'35-dc21 97-49492 CIP Books published by Lawrence Erlbaum Associates are printed on acid-free paper, and their bindings are chosen for strength and durability. Printed in the United States of America 10 9 8 7 6 5
Contents
Preface
ix
Part I: Introduction
1 Structural Equation Models: The Basics
3
Basic Concepts
4
The General LISREL Model
9
The LISREL Confirmatory Factor Analytic Model
17
The LISREL Full Structural Equation Model
37
2 Using LISREL, PRELIS, and SIMPLIS
43
Working with LISREL 8
43
Working with PRELIS 2
69
Working with SIMPLIS
81
Overview of Remaining Chapters
87
Part II: Single-Group Analyses
3 Application 1: Testing the Factorial Validity of a Theoretical Construct (First-Order CFA Model)
91
The Hypothesized Model
91
The LISREL Input File
94
The LISREL Output File
100
Post Hoc Analyses
125
From the SIMPLIS Perspective
130
Applications Related to Other Disciplines
134
4 Application 2: Testing the Factorial Validity of Scores from a Measuring Instrument (First-Order CFA Model)
135
The Measuring Instrument Under Study
136
The Hypothesized Model
137
Post Hoc Analyses
147
From the SIMPLIS Perspective
156
Cross-Validation in Covariance Structure Modeling
157
Applications Related to Other Disciplines
161
5 Application 3: Testing the Factorial Validity of Scores from a Measuring Instrument (Second-Order CFA Model)
163
The Measuring Instrument Under Study
164
Analysis of Categorical Data
165
Preliminary Analyses Using PRELIS
166
The Hypothesized Model
170
Post Hoc Analyses
181
From the SIMPLIS Perspective
189
Applications Related to Other Disciplines
191
6 Application 4: Testing for Construct Validity: The Multitrait-Multimethod Model
193
Preliminary Analyses Using PRELIS: Data Screening
194
The CFA Approach to MTMM Analyses
199
The General CFA MTMM Model
200
Testing for Construct Validity: Comparison of Models
213
Testing for Construct Validity: Examination of Parameters
215
The Correlated Uniqueness MTMM Model
218
From the SIMPLIS Perspective
227
Applications Related to Other Disciplines
229
7 Application 5: Testing the Validity of a Causal Structure
231
The Hypothesized Model
231
Preliminary Analyses Using PRELIS
233
Preliminary Analyses Using LISREL
234
Testing the Hypothesized Model of Causal Structure
238
Post Hoc Analyses
244
From the SIMPLIS Perspective
253
Applications Related to Other Disciplines
255
Part III: Multiple Group Analyses
8 Application 6: Testing for Invariant Factorial Structure of a Theoretical Construct (First-Order CFA Model)
259
Testing for Multigroup Invariance
260
Testing for Invariance Across Gender
263
Hypothesis I
266
Hypothesis II
272
Hypothesis III
274
From the SIMPLIS Perspective
281
Applications Related to Other Disciplines
286
9 Application 7: Testing for Invariant Factorial Structure of Scores from a Measuring Instrument (First-Order CFA Model)
287
Testing for Invariance Across Low and High Academic Tracks
288
Testing Hypotheses Related to Multigroup Invariance
291
Hypothesis I
291
Hypothesis II
294
Hypothesis III
294
Hypothesis IV
296
From the SIMPLIS Perspective
300
Applications Related to Other Disciplines
302
10 Application 8: Testing for Invariant Latent Mean Structures
303
Basic Concepts Underlying Tests of Latent Mean Structures
304
The Hypothesized Model
309
From the SIMPLIS Perspective
322
Applications Related to Other Disciplines
324
11 Application 9: Testing for Invariant Pattern of Causal Structure
327
The Hypothesized Model
328
Testing for Invariance Across Calibration/Validation Samples
330
From the SIMPLIS Perspective
338
Applications Related to Other Disciplines
341
Part IV: LISREL, PRELIS, and SIMPLIS Through Windows
12 Application 10: Testing for Causal Predominance Using a Two-Wave Panel Model
345
Using the Windows Versions of LISREL 8 and PRELIS 2
346
Running LISREL 8 and PRELIS 2
349
The Hypothesized Model
351
From the SIMPLIS Perspective
381
Applications Related to Other Disciplines
385
References
387
Author Index
397
Subject Index
401
Preface
The intent of the present book is to illustrate the ease with which various features of the LISREL 8 program and its companion preprocessor program, PRELIS 2, can be implemented in addressing research questions that lend themselves to structural equation modeling (SEM). Specifically, the purpose is threefold: (a) to present a nonmathematical introduction to basic concepts associated with SEM, (b) to demonstrate basic applications of SEM using both the DOS and Windows versions of LISREL 8, as well as both the LISREL and SIMPLIS lexicons, and (c) to highlight particular features of the LISREL 8 and PRELIS 2 programs that address important caveats related to SEM analyses. This book is intended neither as a text on the topic of SEM, nor as a comprehensive review of the many statistical functions available in the LISREL 8 and PRELIS 2 programs. Rather, the intent is to provide a practical guide to SEM using the LISREL approach. As such, the reader is "walked through" a diversity of SEM applications that include both factor analytic and full latent variable models, as well as a variety of data management procedures. Ideally, this volume serves best as a companion book to the LISREL 8 (Jöreskog & Sörbom, 1996b) and PRELIS 2 (Jöreskog & Sörbom, 1996c) manuals, as well as to any statistics textbook devoted to the topic of SEM. The book is divided into four major sections. In Part I, I introduce basic concepts associated with SEM in general, and the LISREL approach to SEM in particular (chap. 1). Also detailed and illustrated in this chapter are the fundamental equations and matrices associated with the LISREL program. Chapter 2 focuses solely on the LISREL 8 and PRELIS 2 programs. Here, I detail the key elements needed in building files related to both programs. I also review the relatively new SIMPLIS language and illustrate the structuring of input files based on this lexicon.
Part II focuses on applications involving single-group analyses; these include three first-order confirmatory factor analytic (CFA) models, one second-order CFA model, and one full latent variable model. Accompanying each of these applications is at least one illustrated use of the PRELIS 2 program. The first-order CFA applications demonstrate testing for the validity of the theoretical structure of a construct (chap. 3), the factorial structure of a measuring instrument (chap. 4), and multiple traits assessed by multiple methods within
the framework of a multitrait-multimethod (MTMM) design (chap. 6); unique to the present book are applications of both the General CFA and Correlated Uniqueness MTMM models. The second-order CFA model bears on the factorial structure of a measuring instrument (chap. 5). The final single-group application tests for the validity of a hypothesized causal structure (chap. 7). The PRELIS 2 examples demonstrate the procedures involved in creating a smaller subset of data (chap. 3), a covariance matrix (chap. 4), and polychoric and asymptotic matrices (chap. 5). Also illustrated are the descriptive statistics available for the screening of data (chaps. 6 & 7). In Part III, I present applications related to multigroup analyses. Specifically, I show how to test for measurement and structural invariance related to a theoretical construct (chap. 8), a measuring instrument (chap. 9), and a causal structure (chap. 11). Finally, in chapter 10, I outline basic concepts associated with latent mean structures and demonstrate testing for their invariance across groups. Part IV contains the final chapter of the book, chapter 12, which is devoted to the new Windows versions of both the LISREL 8 and PRELIS 2 programs. For purposes of demonstrating assorted aspects of the Microsoft Windows environment, the final application of SEM involves a two-wave panel study. The chapter begins by describing the various drop-down menus, and then by detailing each of the options made available to the user. Of particular importance is the diversity of options associated with the Path Diagram menu; a broad array of these schematic displays are illustrated and described. Finally, a further example is provided in demonstrating the use of PRELIS 2 to formulate a subset of data. In writing a book of this nature, it was essential that I have access to a number of different data sets that lent themselves to various applications.
To facilitate this need, all examples presented throughout the book are drawn from my own research; related journal references are cited for readers who may be interested in a more detailed discussion of theoretical frameworks, aspects of the methodology, and/or substantive issues and findings. In summary, each application in the book is accompanied by the following:
• a statement of the hypothesis to be tested;
• a schematic representation of the model under study;
• a full explanation bearing on related LISREL 8 and PRELIS 2 input files;
• a full explanation and interpretation of selected LISREL 8 and PRELIS 2 output files;
• at least one model input file based on the SIMPLIS language;
• the published reference from which the application was drawn;
• published references related to disciplines other than psychology.
Preface
xi
It is important to emphasize that, although all applications are based on data that are of a social/psychological nature, they could just as easily have been based on data representative of the health sciences, leisure studies, marketing, or a multitude of other disciplines; my data, then, serve only as one example of each application. Indeed, I strongly recommend seeking out and examining similar examples as they relate to other subject areas. To assist in this endeavor, I have provided a few references at the end of each application chapter (chaps. 3-12). Because many readers are familiar with my earlier book (referenced here as The Primer), which also addressed the basic concepts of SEM and demonstrated applications based on the LISREL program (Byrne, 1989), it seems prudent that I outline the many features of the present volume that set it apart from its predecessor. Indeed, there are several major differences between the two publications. First, discussions of important issues related to SEM (e.g., statistical identification, goodness-of-fit, post hoc model fitting) are substantially more extensive in the present volume than in The Primer. Second, whereas The Primer focused solely on first-order CFA models, the present text additionally includes second-order CFA and full SEM models. Specifically, in addition to the six CFA applications illustrated in the earlier book, this new book includes applications of (a) a second-order CFA model, (b) a full SEM model, (c) invariance related to a full SEM model, and (d) a full SEM panel model. Third, in contrast to The Primer, in which no mention was made with respect to categorical data, the present volume provides both a discussion of the related issues and an application wherein the ordinality of the data is taken into account (chap. 5). Fourth, in contrast to the example of an MTMM model presented in The Primer, the application shown in the present text additionally includes the Correlated Uniqueness MTMM model.
Fifth, whereas examples in The Primer were based on the LISREL V and VI versions of the program, the present book is based on LISREL 8.14, the most current version as this volume goes to press. Furthermore, to illustrate the various features of the LISREL 8 Windows version, one application (chap. 12) is conducted solely within the Windows environment. Sixth, in light of the development of both the PRELIS program and the SIMPLIS language, subsequent to the writing of The Primer, the present text, as detailed here, provides considerable description and various examples related to each. Seventh, to address any concerns that all applications in the text are based on psychoeducational data, references bearing on similar applications pertinent to other disciplines (as noted previously) are provided at the end of each application chapter (chaps. 3-12). Finally, with the exception of two applications (chaps. 8 & 9), all examples presented in the current text differ from those in The Primer. Whereas the application presented in chapter 10 is based on the same data as that used in the earlier book, the procedure for testing
xii
Preface
the invariance of latent mean structures using LISREL 8 is substantially different. Although there are now several structural equation modeling texts available, the present book distinguishes itself from the rest in a number of ways. First, it is the only book to demonstrate, by application to actual data, a wide range of confirmatory factor analytic and latent variable models drawn from published studies, along with a detailed explanation of each input and output file. Second, it is the only book to incorporate applications based on both the LISREL 8 and PRELIS 2 programs, and to include example files based on the SIMPLIS language. Third, it is the first book to present an application of a structural equation model using the LISREL 8 program in the Windows environment. Fourth, it is the only book to literally "walk" the reader through (a) the various matrices associated with the LISREL approach to structural equation modeling, (b) the model specification, estimation, and modification decisions and processes associated with a variety of applications, and (c) the various drop-down menus and diagrammer features associated with the Windows version of LISREL 8 and PRELIS 2. Fifth, it is the first book to demonstrate an application of structural equation modeling to the Correlated Uniqueness Multitrait-multimethod model (chap. 6). Finally, the present book contains the most extensive and current description and explanation of goodness-of-fit indices to date.
ACKNOWLEDGMENTS As with any project of this nature, there are many people to whom I owe a great deal of thanks. First and foremost, I am truly indebted to Larry Erlbaum who provided me with an absolutely peerless team of editorial, artistic, and marketing experts. This extremely talented, efficient, and personable cadre of individuals literally performed miracles in getting this book through the production process as if there were no tomorrow. Thank you all for a wonderful experience! On a more personal basis, I wish to thank: Anne Monaghan for keeping me updated on a regular basis regarding the progression of this volume through the many stages of production, and for always being true to her word regarding time lines. If Anne advised me that I could expect to receive certain materials on a specific date, they did indeed arrive!; to Sharon Levy, who not only understood, but also shared my love of bright colors, for making sure that the cover of my present book met that criterion; to Joe Petrowski for spreading the word that this volume is now available; and to Art Lizza for overseeing the entire production of the book.
Preface
xiii
I wish also to extend thanks to my three reviewers: Karl Jöreskog, for "setting me straight" on the appropriate language and expression of particular mathematical formulae; Bruce Thompson, for his very thorough reading of the text and his welcome suggestions for expanded explanations regarding particular topics; an anonymous reviewer for very helpful comments and suggestions bearing on the clarity of text. I am most grateful to Darlene Worth for her invaluable assistance in the production of all figures related to the various applications presented in this book, as well as to the students in my 1997 (winter semester) Causal Modeling course, for their careful attention to typographical errors, and textual ambiguities and inconsistencies related to the initial draft of the manuscript. As always, I am truly indebted to my husband, Alex, for his continued support and understanding of the incredible number of hours that my computer and I necessarily spend together on a project of this sort. At this point in time, however, I think that he has given up trying to modify my compulsive behavior that will inevitably lead me into taking on yet another project of similar dimension. Finally, for my many readers who have inquired from time to time about my yellow lab Amy, my constant companion throughout the writing of my last three books, it is with much sadness that I report that, at 12 1/2 years of age, and following my initial submission of the present manuscript, she decided that a good long sleep was in order. I miss her dearly!
PART ONE
Introduction
CHAPTER 1
Structural Equation Models: The Basics
CHAPTER 2 Using LISREL, PRELIS, and SIMPLIS
CHAPTER 1
Structural Equation Models: The Basics
Structural equation modeling (SEM) is a statistical methodology that takes a confirmatory (i.e., hypothesis-testing) approach to the multivariate analysis of a structural theory bearing on some phenomenon. Typically, this theory represents "causal" processes that generate observations on multiple variables (Bentler, 1988). The term structural equation modeling conveys two important aspects of the procedure: (a) that the causal processes under study are represented by a series of structural (i.e., regression) equations, and (b) that these structural relations can be modeled pictorially to enable a clearer conceptualization of the theory under study. The hypothesized model can then be tested statistically in a simultaneous analysis of the entire system of variables to determine the extent to which it is consistent with the data. If goodness-of-fit is adequate, the model argues for the plausibility of postulated relations among variables; if it is inadequate, the tenability of such relations is rejected. Several aspects of SEM set it apart from the older generation of multivariate procedures (see Fornell, 1982). First, as noted earlier, it takes a confirmatory, rather than an exploratory, approach to the data analysis (although aspects of the latter can be addressed). Furthermore, by demanding that the pattern of intervariable relations be specified a priori, SEM lends itself well to the analysis of data for inferential purposes. By contrast, most other multivariate procedures are essentially descriptive by nature (e.g., exploratory factor analysis), so that hypothesis testing is difficult, if not impossible. Second, whereas traditional multivariate procedures are incapable of either assessing or correcting for measurement error, SEM provides explicit estimates of these parameters. Finally, whereas data analyses using the former methods are based on observed measurements only, those using SEM procedures can incorporate both unobserved (i.e., latent) and observed variables. Given these highly desirable characteristics, SEM has become a popular methodology for nonexperimental research, where methods for testing theories are not well developed and ethical considerations make experimental design unfeasible (Bentler, 1980). Structural equation modeling can be utilized very effectively to address numerous research problems involving nonexperimental research; in this book, I illustrate the most common applications (e.g., chapters 3, 4, 7, 9, and 11), as well as some that are less frequently found in the substantive literatures (e.g., chapters 5, 6, 8, 10, and 12). Before showing you how to use the LISREL program (Jöreskog & Sörbom, 1993b), along with its companion package PRELIS (Jöreskog & Sörbom, 1993c) and second language option SIMPLIS, however, it is essential that I first review key concepts associated with the methodology. We turn now to their brief explanation.
BASIC CONCEPTS
Latent Versus Observed Variables
In the behavioral sciences, researchers are often interested in studying theoretical constructs that cannot be observed directly. These abstract phenomena are termed latent variables, or factors. Examples of latent variables in psychology are self-concept and motivation; in sociology, powerlessness and anomie; in education, verbal ability and teacher expectancy; in economics, capitalism and social class. Because latent variables are not observed directly, it follows that they cannot be measured directly. Thus, the researcher must operationally define the latent variable of interest in terms of behavior believed to represent it. As such, the unobserved variable is linked to one that is observable, thereby making its measurement possible. Assessment of the behavior, therefore, constitutes the direct measurement of an observed variable, albeit the indirect measurement of an unobserved variable (i.e., the underlying construct). It is important to note that the term behavior is used here in the very broadest sense to include scores on a particular measuring instrument. Thus,
observation may include, for example, self-report responses to an attitudinal scale, scores on an achievement test, in vivo observation scores representing some physical task or activity, coded responses to interview questions, and the like. These measured scores (i.e., measurements) are termed observed or manifest variables; within the context of SEM methodology, they serve as indicators of the underlying construct that they are presumed to represent. Given this necessary bridging process between observed variables and unobserved latent variables, it should now be clear why methodologists urge researchers to be circumspect in their selection of assessment measures. Although the choice of psychometrically sound instruments bears importantly on the credibility of all study findings, such selection becomes even more critical when the observed measure is presumed to represent an underlying construct.¹
The Factor-Analytic Model
The oldest and best-known statistical procedure for investigating relations between sets of observed and latent variables is that of factor analysis. In using this approach to data analyses, the researcher examines the covariation among a set of observed variables in order to gather information on their underlying latent constructs (i.e., factors). There are two basic types of factor analyses: exploratory factor analysis (EFA) and confirmatory factor analysis (CFA). We turn now to a brief description of each. EFA is designed for the situation where links between the observed and latent variables are unknown or uncertain. The analysis thus proceeds in an exploratory mode to determine how, and to what extent, the observed variables are linked to their underlying factors. Typically, the researcher wishes to identify the minimal number of factors that underlie (or account for) covariation among the observed variables. For example, suppose a researcher develops a new instrument designed to measure five facets of physical self-concept (e.g., Health, Sport Competence, Physical Appearance, Coordination, Body Strength). Following the formulation of questionnaire items designed to measure these five latent constructs, he or she would then
¹Throughout the remainder of the book, the terms latent, unobserved, or unmeasured variable are used synonymously to represent a hypothetical construct or factor; the terms observed, manifest, and measured variable are also used interchangeably.
conduct an EFA to determine the extent to which the item measurements (the observed variables) were related to the five latent constructs. In factor analysis, these relations are represented by factor loadings.² The researcher would hope that items designed to measure health, for example, exhibited high loadings on that factor, albeit low or negligible loadings on the other four factors. This factor analytic approach is considered to be exploratory in the sense that the researcher has no prior knowledge that the items do, indeed, measure the intended factors. (For extensive discussions of EFA, see Comrey, 1992; Gorsuch, 1983; McDonald, 1985; Mulaik, 1972). In contrast to EFA, CFA is appropriately used when the researcher has some knowledge of the underlying latent variable structure. Based on knowledge of the theory, empirical research, or both, he or she postulates relations between the observed measures and the underlying factors a priori, and then tests this hypothesized structure statistically. For example, based on the example cited earlier, the researcher would argue for the loading of items designed to measure sport competence self-concept on that specific factor, and not on the health, physical appearance, coordination, or body strength self-concept dimensions. Accordingly, a priori specification of the CFA model would allow all sport competence self-concept items to be free to load on that factor, but restricted to have zero loadings on the remaining factors. The model would then be evaluated by statistical means to determine the adequacy of its goodness of fit to the sample data. (For more detailed discussions of CFA, see e.g., Bollen, 1989b; Hayduk, 1987; Long, 1983a.) In summary, therefore, the factor analytic model (EFA or CFA) focuses solely on how, and the extent to which, the observed variables are linked to their underlying latent factors.
More specifically, it is concerned with the extent to which the observed variables are generated by the underlying latent constructs, and thus the strength of the regression paths from the factors to the observed variables (the factor loadings) is of primary interest. Although interfactor relations are also of interest, any regression structure among them is not considered in the factor analytic model. Because the CFA model focuses solely on the link between factors and their measured variables, within the framework of SEM, it represents what has been termed a measurement model.
²Although the term loading is commonly used in the CFA literature, Thompson and Daniel (1996) have argued that the more specific term of structure coefficient should be used.
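The contrast between exploratory and confirmatory specification can be sketched in terms of which loadings are freely estimated and which are fixed to zero. The following is a minimal sketch using the hypothetical five-facet physical self-concept instrument described above; the factor and item names are illustrative assumptions, not taken from an actual instrument:

```python
# Contrast EFA and CFA loading specifications for a hypothetical
# five-factor physical self-concept instrument. "free" marks a loading
# to be estimated from the data; 0.0 marks a loading fixed to zero.

factors = ["Health", "Sport", "Appearance", "Coordination", "Strength"]

# A priori assignment of each item to its target factor (CFA only).
item_targets = {"item1": "Health", "item2": "Health",
                "item3": "Sport", "item4": "Sport",
                "item5": "Strength"}

def cfa_pattern(targets, factors):
    # CFA: each item loads freely only on its hypothesized factor;
    # all other loadings are restricted to zero.
    return {item: ["free" if f == target else 0.0 for f in factors]
            for item, target in targets.items()}

def efa_pattern(targets, factors):
    # EFA: every item is free to load on every factor.
    return {item: ["free"] * len(factors) for item in targets}

for item, row in cfa_pattern(item_targets, factors).items():
    print(item, row)
```

In LISREL terms, the zeros in the CFA pattern correspond to fixed elements of the factor-loading matrix, and the "free" entries to parameters estimated from the sample data.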
The Full Latent Variable Model
In contrast to the factor analytic model, the full latent variable (LV) model allows for the specification of regression structure among the latent variables. That is, the researcher can hypothesize the impact of one latent construct on another in the modeling of causal direction. This model is termed full (or complete) because it comprises both a measurement model and a structural model: the measurement model depicting the links between the latent variables and their observed measures (i.e., the CFA model), and the structural model depicting the links among the latent variables themselves. A full LV model that specifies direction of cause from one direction only is termed a recursive model; one that allows for reciprocal or feedback effects is termed a nonrecursive model. Only applications of recursive models are considered in this book.
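The recursive/nonrecursive distinction lends itself to a simple mechanical check: a model is recursive when its latent-to-latent paths, viewed as a directed graph, contain no feedback loop. The sketch below (with hypothetical construct names) illustrates the idea:

```python
# Check whether a set of structural (latent-to-latent) paths is recursive,
# i.e., free of feedback loops. The path sets below are hypothetical.

def is_recursive(paths):
    """Return True if the directed graph given by (cause, effect) pairs
    has no cycle (a recursive model), False otherwise (nonrecursive)."""
    graph = {}
    for cause, effect in paths:
        graph.setdefault(cause, []).append(effect)

    visiting, done = set(), set()

    def has_cycle(node):
        if node in done:
            return False
        if node in visiting:        # revisiting a node on the current path
            return True             # means a feedback loop exists
        visiting.add(node)
        if any(has_cycle(nxt) for nxt in graph.get(node, [])):
            return True
        visiting.discard(node)
        done.add(node)
        return False

    return not any(has_cycle(n) for n in list(graph))

# One-directional causal flow: recursive
print(is_recursive([("SelfConcept", "Achievement"),
                    ("Motivation", "Achievement")]))   # True

# Reciprocal effect between two constructs: nonrecursive
print(is_recursive([("SelfConcept", "Achievement"),
                    ("Achievement", "SelfConcept")]))  # False
```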
General Purpose and Process of Statistical Modeling
Statistical models provide an efficient and convenient way of describing the latent structure underlying a set of observed variables. Expressed either diagrammatically, or mathematically via a set of equations, such models explain how the observed and latent variables are related to one another. Typically, a researcher postulates a statistical model based on his or her knowledge of the related theory, on empirical research in the area of study, or some combination of both. Once the model is specified, the researcher then tests its plausibility based on sample data that comprise all observed variables in the model. The primary task in this model-testing procedure is to determine the goodness of fit between the hypothesized model and the sample data. As such, the researcher imposes the structure of the hypothesized model on the sample data, and then tests how well the observed data fit this restricted structure. (A more extensive discussion of this process is presented in Chapter 3.) Because it is highly unlikely that a perfect fit will exist between the observed data and the hypothesized model, there will necessarily be a discrepancy between the two; this discrepancy is termed the residual. The model-fitting process can therefore be summarized as:
Data = Model + Residual
where:
Data represent score measurements related to the observed variables as derived from persons comprising the sample;
Model represents the hypothesized structure linking the observed variables to the latent variables, and in some models, linking particular latent variables to one another;
Residual represents the discrepancy between the hypothesized model and the observed data.
In summarizing the general strategic framework for testing structural equation models, Jöreskog (1993) distinguished among three scenarios that he termed strictly confirmatory (SC), alternative models (AM), and model generating (MG). In the first instance (the SC scenario), the researcher postulates a single model based on theory, collects the appropriate data, and then tests the fit of the hypothesized model to the sample data. From the results of this test, the researcher either rejects or fails to reject the model; no further modifications to the model are made. In the AM case, the researcher proposes several alternative (i.e., competing) models, all of which are grounded in theory. Following analysis of a single set of empirical data, he or she selects one model as most appropriate in representing the sample data. Finally, the MG scenario represents the case where the researcher, having postulated and rejected a theoretically derived model on the basis of its poor fit to the sample data, proceeds in an exploratory (rather than confirmatory) fashion to modify and reestimate the model. The primary focus here is to locate the source of misfit in the model and to determine a model that better describes the sample data. Jöreskog noted that, although respecification may be either theory- or data-driven, the ultimate objective is to find a model that is both substantively meaningful and statistically well fitting. He further posited that despite the fact that "a model is tested in each round, the whole approach is model generating, rather than model testing" (p. 295).
Of course, even a cursory review of the empirical literature will clearly show the MG situation to be the most common of the three scenarios, and for good reason. Given the many costs associated with the collection of data, it would be a rare researcher indeed who could afford to terminate his or her research on the basis of a rejected hypothesized model! As a consequence, the SC scenario is not commonly found in practice. Although the AM approach to modeling has also been a relatively uncommon practice, at least two important papers on the topic (e.g., MacCallum, Roznowski, & Necowitz, 1992; MacCallum, Wegener, Uchino, & Fabrigar, 1993) have recently precipitated more activity with respect to this analytic strategy.
Statistical theory related to these model-fitting processes can be found (a) in texts devoted to the topic of SEM (e.g., Bollen, 1989b; Hayduk, 1987; Loehlin, 1992; Long, 1983b; Mueller, 1996; Saris & Stronkhorst, 1984; Schumacker & Lomax, 1996), (b) in edited books devoted to the topic (e.g., Bollen & Long, 1993; Hoyle, 1995b; Marcoulides & Schumacker, 1996), and (c) in methodologically oriented journals such as British Journal of Mathematical and Statistical Psychology, Journal of Educational Statistics, Multivariate Behavioral Research, Psychological Methods, Psychometrika, Sociological Methodology, Sociological Methods & Research, and Structural Equation Modeling: A Multidisciplinary Journal.
THE GENERAL LISREL MODEL
As with communication in general, one must first acquire an understanding of the language before being able to interpret the message conveyed. So it is with SEM. Although the researcher wishing to use SEM procedures now has several other computer programs from which to choose (e.g., AMOS-Arbuckle, 1995; EQS-Bentler, 1995; LISCOMP-Muthén, 1988; CALIS-SAS Institute, 1992; RAMONA-Browne, Mels, & Coward, 1994; SEPATH-Steiger, 1994), the LISREL program is the most longstanding and widely distributed. (For an extended list and annotated bibliography, see Austin & Calderon, 1996.) Indeed, it has served as the prototype for all subsequently developed programs. Nonetheless, each of these programs is unique in the command language it uses in model specification. In this regard, LISREL stands apart from the other programs in its "bilingual" capabilities. That is, in lieu of using the original Greek language traditionally associated with statistical models, the researcher can opt to specify models using everyday language, as made possible by the SIMPLIS command language. Although I describe the SIMPLIS notation in Chapter 2, and provide example input files throughout, the format of this book follows my earlier one (Byrne, 1989) in focusing on the original LISREL notation.
To fully comprehend the nature of both CFA and full LV models within the framework of the LISREL program, it is helpful if we first examine a generalized model structure. For didactic purposes only, I have termed this
model a General LISREL model; later in the chapter, I refer to other types of LISREL models. By taking this general model approach, I can then decompose the model into its component parts, which, I believe, will reduce much of the mystery associated with model specification using the LISREL command language. We turn now to this decomposition process.
Basic Composition
As noted earlier, the general LISREL model can be decomposed into two submodels: a measurement model and a structural model. The measurement model defines relations between the observed and unobserved variables. In other words, it provides the link between scores on a measuring instrument (i.e., the observed indicator variables) and the underlying constructs they are designed to measure (i.e., the unobserved latent variables). The measurement model, therefore, specifies the pattern by which each measure loads on a particular factor. It also describes the measurement properties (reliability, validity) of the observed variables (Jöreskog & Sörbom, 1989). The structural model defines relations among the unobserved variables. Accordingly, it specifies which latent variable(s) directly or indirectly influences (i.e., "causes") changes in the values of other latent variables in the model.
One necessary requirement in working with LISREL is that, in specifying full structural equation models, the researcher must distinguish between latent variables that are exogenous and those that are endogenous. Exogenous latent variables are synonymous with independent variables; they "cause" fluctuations in the values of other latent variables in the model. Changes in the values of exogenous variables are not explained by the model. Rather, they are considered to be influenced by other factors external to the model. Endogenous latent variables are synonymous with dependent variables and, as such, are influenced by the exogenous variables in the model, either directly or indirectly. Fluctuation in the values of endogenous variables is said to be explained by the model because all latent variables that influence them are included in the model specification.
The Link Between Greek and LISREL
In the Jöreskog tradition, the command language of LISREL is couched in the mathematical statistical literature, which commonly uses Greek letters to denote parameters. Despite the availability of SIMPLIS, as noted earlier,
it behooves the serious user of LISREL to master the original language. Thus, a second requirement in learning to work with the LISREL program is to become thoroughly familiar with the various LISREL matrices and the Greek letters that represent them. What follows now is a description of these basic matrices.
Consistent with matrix algebra, matrices are represented by uppercase Greek letters, and their elements by lowercase Greek letters; the elements represent the parameters in the model. By convention, observed measures are represented by Roman letters. Accordingly, those that are exogenous are termed X-variables, and those that are endogenous are termed Y-variables. As becomes more evident later in the section dealing with the LISREL CFA model, the measurement model may be specified either in terms of LISREL exogenous notation (i.e., X-variables) or in terms of its endogenous notation (i.e., Y-variables). Each of these measurement models can be defined by one matrix and two vectors as follows:³ one regression matrix relating the exogenous (Λx) (or endogenous [Λy]) LVs to their respective observed measures, one vector of latent exogenous (ξs) (or endogenous [ηs]) variables, and one vector of measurement errors related to the exogenous (δs) (or endogenous [εs]) observed variables. The structural model can be defined by two matrices and three vectors. These include one matrix of coefficients relating exogenous LVs to endogenous LVs (Γ), one matrix of coefficients relating endogenous LVs to other endogenous LVs (B), one vector of latent exogenous variables (ξs), one vector of latent endogenous variables (ηs), and one vector of residual errors associated with the endogenous LVs (ζs).⁴
Taken together, the general LISREL model can be captured by the following three equations:

Measurement model for the X-variables:
x = Λxξ + δ        (1.1)

Measurement model for the Y-variables:
y = Λyη + ε        (1.2)

³A matrix represents a series of numbers written in rows and columns; each number in the matrix is termed an element. A vector is a special matrix case; a column vector has several rows, albeit one column, whereas a row vector has several columns and only one row.
⁴These residual terms represent errors in the equation in the prediction of endogenous factors from exogenous factors; they are typically referred to as disturbance terms to distinguish them from errors of measurement associated with the observed variables.
Structural equation model:
η = Bη + Γξ + ζ        (1.3)

In an attempt to clarify the nature of the various components of these equations, each is now described in more detail. Because many readers may feel a little overwhelmed by the use of symbols and letters in these descriptions, I wish first to explain the basic rule by which the subscript numbers are governed. By convention, matrices are defined according to their number of rows and columns; the number of rows is always specified first. This row-by-column description of a matrix is termed the "order" of the matrix. We turn now to a more specific definition for each of these terms.
The Measurement Model
x is a q × 1 vector of observed exogenous variables
y is a p × 1 vector of observed endogenous variables
ξ is an n × 1 vector of latent exogenous variables
η is an m × 1 vector of latent endogenous variables
δ is a q × 1 vector of measurement errors in x
ε is a p × 1 vector of measurement errors in y
Λx (lambda-x) is a q × n regression matrix that relates the n exogenous factors to each of the q observed variables designed to measure them
Λy (lambda-y) is a p × m regression matrix that relates the m endogenous factors to each of the p observed variables designed to measure them
The Structural Model
Γ (gamma) is an m × n matrix of coefficients that relates the n exogenous factors to the m endogenous factors
B (beta) is an m × m matrix of coefficients that relates the m endogenous factors to one another
ζ (zeta) is an m × 1 vector of residuals representing errors in the equation relating η and ξ

A summary of these matrices, together with their Greek and program notations, is presented in Table 1.1. It is of further importance to note that, in the specification of any LISREL model, the following minimal assumptions are presumed to hold:

ε is uncorrelated with η
δ is uncorrelated with ξ
ζ is uncorrelated with ξ
ζ, ε, and δ are mutually uncorrelated
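Equations 1.1 through 1.3, together with these assumptions, can be used to generate simulated data. The sketch below is illustrative only; the dimensions, parameter values, and the choice B = 0 are all assumptions made for the example, not values from the text:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000                                # sample size (arbitrary)

# Illustrative dimensions: q = 3 x-variables, p = 2 y-variables,
# one exogenous factor (xi), one endogenous factor (eta)
Lx = np.array([[0.8], [0.7], [0.6]])    # Lambda-x
Ly = np.array([[0.9], [0.75]])          # Lambda-y
G = np.array([[0.5]])                   # Gamma
B = np.array([[0.0]])                   # Beta (no eta-to-eta paths here)

xi = rng.normal(size=(n, 1))
zeta = rng.normal(scale=0.5, size=(n, 1))

# Structural model (Eq. 1.3): eta = B eta + Gamma xi + zeta,
# solved as eta = (I - B)^-1 (Gamma xi + zeta)
eta = (xi @ G.T + zeta) @ np.linalg.inv(np.eye(1) - B).T

# Measurement models (Eqs. 1.1 and 1.2): independent errors, per the assumptions
delta = rng.normal(scale=0.4, size=(n, 3))
eps = rng.normal(scale=0.4, size=(n, 2))
x = xi @ Lx.T + delta
y = eta @ Ly.T + eps
print(x.shape, y.shape)
```

Because x and y share no error terms, any observed x-y correlation in these simulated scores is carried entirely by the ξ → η path, which is exactly the logic the structural model encodes.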
Table 1.1
Summary of Matrices, Greek Notation, and Program Codes

Matrix            Greek Letter    Matrix Elements    Program Code    Matrix Type
Measurement Model
  Lambda-X        Λx              λx                 LX              Regression
  Lambda-Y        Λy              λy                 LY              Regression
  Theta delta     Θδ              θδ                 TD              Var/cov
  Theta epsilon   Θε              θε                 TE              Var/cov
Structural Model
  Gamma           Γ               γ                  GA              Regression
  Beta            B               β                  BE              Regression
  Phi             Φ               φ                  PH              Var/cov
  Psi             Ψ               ψ                  PS              Var/cov
  Xi (or Ksi)     ξ                                                  Vector
  Eta             η                                                  Vector
  Zeta            ζ                                                  Vector

Note. Var/cov = variance-covariance.
To comprehend more fully how the Greek lettering system works, and how variables in a specified model may be linked to one another within the context of the LISREL program, let us turn to the pictorial presentation of a simple hypothetical model illustrated in Fig. 1.1. Schematic representations of models such as this one are termed path diagrams because they provide a visual portrayal of relations that are assumed to hold among the variables under study.
The LISREL Path Diagram
Visible Components
By convention, in the schematic presentation of structural equation models, measured variables are shown in boxes and unmeasured variables in circles (or ellipses). Thus, in reviewing the model shown in Fig. 1.1, we see that there are two latent variables (ξ1, η1) and five observed variables (x1-x3; y1-y2); the xs and ys function as indicators of their respective underlying latent factors. Associated with each observed variable is an error term (δ1-δ3; ε1-ε2), and with the factor being predicted (η1), a disturbance (i.e., residual) term (ζ1).

FIG. 1.1. A LISREL Structural Equation Model.

It is worth noting that, technically speaking, both measurement and residual error terms represent unobserved variables and, thus, could quite correctly be enclosed in circles. For example, although the disturbance terms (only) are often enclosed in the presentation of path diagrams using the EQS program (Bentler, 1995), it is not customary to enclose either of these terms in LISREL path diagrams.
In addition to the aforementioned symbols that represent variables, certain others are used in path diagrams to denote hypothesized processes involving the entire system of variables. In particular, one-way arrows represent structural regression coefficients and thus indicate the impact of one variable on another. In Fig. 1.1, for example, the unidirectional arrow pointing toward η1 implies that the exogenous factor ξ1 "causes" the endogenous factor η1.⁵ Likewise, the three unidirectional arrows leading from the exogenous factor (ξ1) to each of the three observed variables (x1, x2, x3) suggest that these score values (representing, say, item or subscale scores of an assessment measure) are influenced by ξ1. In LISREL, these regression paths are represented by λ; consistent with the exogenous and endogenous specification of the measurement models, these regression coefficients are labeled λx and λy, respectively. As such, they represent the magnitude of expected change in the observed variables for every change in the related latent variable (or factor). The ordering of the subscripts related to these coefficients is keyed to the direction of the arrow. That is, one works from the variable to which the arrow is pointing to the source variable. Thus λ31, for example, represents the regression of x3 on ξ1. For the sake of clarity in the schematic presentation of all subsequent models, however, these "x" and "y" subscripts will not be explicitly labeled; their designation is implicit in the specification of the particular Λ matrix modeled.
The sourceless one-way arrows pointing from the δs and εs indicate the impact of random measurement error on the observed xs and ys, respectively, and from ζ1, the impact of error in the prediction of η1. Finally, curved two-way arrows represent covariances or correlations between pairs of variables. Thus, the bidirectional arrow linking δ1 and δ2, as shown in Fig. 1.1, implies that measurement error associated with x1 is correlated with that associated with x2. These symbols are summarized in Table 1.2.

⁵In this text, a cause is a direct effect of a variable on another within the context of a complete model. Its magnitude and direction are given by the partial regression coefficient. If the complete model contains all relevant influences on a given dependent variable, its causal precursors are correctly specified. In practice, however, models may omit key predictors, and may be misspecified, so that it may be inadequate as a causal model in the philosophical sense.
Table 1.2
Symbols Associated With Structural Equation Models in LISREL

Symbol                                            Representation
Ellipse (or circle)                               Unobserved (latent) factor
Rectangle (box)                                   Observed variable
One-way arrow from factor to observed variable    Path coefficient for regression of observed variable onto unobserved factor
One-way arrow from factor to factor               Path coefficient for regression of one factor onto another
One-way arrow from ζ to factor                    Residual error (disturbance) in prediction of unobserved factor
One-way arrow from δ or ε to observed variable    Measurement error associated with observed variable
As noted in the initial paragraph of this chapter, in addition to lending themselves to pictorial description via a schematic presentation of the causal processes under study, structural equation models can also be represented by a series of regression (i.e., structural) equations. Because (a) regression equations represent the influence of one or more variables on another, and (b) this influence, conventionally in SEM, is symbolized by a single-headed arrow pointing from the variable of influence to the variable of interest (i.e., the dependent variable), we can think of each equation as summarizing the impact of all relevant variables in the model (observed and unobserved) on one specific variable (observed or unobserved). Thus, one relatively simple approach to formulating these equations is to note each variable that has one or more arrows pointing toward it, and then record the summation of all such influences for each of these dependent variables. To illustrate this translation of regression processes into structural equations, let us turn again to Fig. 1.1. We can see that there are six variables with arrows pointing toward them; five represent observed variables (x1-x3; y1-y2), and one represents an unobserved variable (or factor; η1). Thus, we know that the regression functions symbolized in the model shown in Fig. 1.1 can be summarized in terms of six separate regression equations:
η1 = γ11ξ1 + ζ1
x1 = λx11ξ1 + δ1
x2 = λx21ξ1 + δ2        (1.4)
x3 = λx31ξ1 + δ3
y1 = λy11η1 + ε1
y2 = λy21η1 + ε2
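The diagram-to-equation translation just described is mechanical: list every variable with at least one arrow pointing toward it, then sum its incoming influences. A small sketch of that bookkeeping, in which the string labels are hypothetical stand-ins for the Greek parameters of Fig. 1.1:

```python
# Each dependent variable maps to the terms whose arrows point at it
# (string labels are placeholders for the Greek parameters of Fig. 1.1)
incoming = {
    "eta1": ["gamma11*xi1", "zeta1"],
    "x1": ["lambda_x11*xi1", "delta1"],
    "x2": ["lambda_x21*xi1", "delta2"],
    "x3": ["lambda_x31*xi1", "delta3"],
    "y1": ["lambda_y11*eta1", "eps1"],
    "y2": ["lambda_y21*eta1", "eps2"],
}

# One regression equation per dependent variable
equations = [f"{dep} = " + " + ".join(terms) for dep, terms in incoming.items()]
print("\n".join(equations))
```

Six variables receive arrows, so six equations result; variables that receive no arrows (ξ1 and the error terms) never appear on a left-hand side.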
Nonvisible Components
Although, in principle, there is a one-to-one correspondence between the schematic presentation of a model and its translation into a set of structural equations, it is important to note that neither one of these model representations tells the whole story; some parameters critical to the estimation of the model are not explicitly shown and thus may not be obvious to the novice structural equation modeler. For example, in both the path diagram and the equations presented earlier, there is no indication that the variances of the independent variables are parameters in the model; indeed, such parameters are essential to all structural equation models. Thus, the researcher must be mindful of this inadequacy of path diagrams when specifying his or her model input.
Considering the other side of the coin, however, it is equally important to draw your attention to the specified nonexistence of certain parameters in the model. For example, in Fig. 1.1, we detect no curved arrow between ε1 and ε2, which suggests the lack of covariance between the error terms associated with the observed variables y1 and y2. Similarly, there is no hypothesized covariance between ζ1 and ξ1; absence of this path addresses the common, and most often necessary, assumption that the predictor (or exogenous) variable is in no way associated with any error arising from the prediction of the criterion (or endogenous) variable.
The core parameters of concern in structural equation models are the regression coefficients, and the variances and covariances of the independent variables.⁶ However, given that sample data comprise observed scores only, there needs to be some internal mechanism whereby the data are transposed into parameters of the model. This task is accomplished via a mathematical model representing the entire system of variables. Such representation systems can, and do, vary with each SEM computer program. (For a comparison of the LISREL and EQS representation systems, for example, see Bentler, 1988.) Because adequate explanation of the way in which the LISREL representation system operates demands knowledge of the program's underlying statistical theory, the topic goes beyond the aims and intent of the present volume. Thus, readers interested in a comprehensive explanation of this aspect of the LISREL approach to the analysis of covariance structures are referred to the following texts (Bollen, 1989b; Hayduk, 1987; Saris & Stronkhorst, 1984) and monographs (Jöreskog & Sörbom, 1979; Long, 1983b).
Finally, an important corollary of SEM is that the variances and covariances of dependent (or endogenous) variables, whether they be observed or unobserved, are never parameters of the model; rather, they remain to be explained by those parameters that represent the independent (or exogenous) variables in the model. In contrast, the variances and covariances of independent variables are important parameters that need to be estimated.
THE LISREL CONFIRMATORY FACTOR-ANALYTIC MODEL
In an earlier section of this chapter entitled "Basic Concepts," I presented a brief comparison of exploratory and confirmatory factor analyses. In describing the CFA model, I noted that it focuses solely on relations between the observed variables and their underlying factors. Because the CFA model is concerned only with the way in which observed measurements are mapped to particular factors, and not with causal relations among factors, it is termed a measurement model; as noted earlier, it represents only a portion of the general LISREL model.
Specification of a LISREL CFA model requires the researcher to choose between one that is exogenously oriented and one that is endogenously oriented. Although the choice is purely an arbitrary one, once the decision is made within the framework of one or the other, the specification of all parameters must be consistent with the chosen orientation.⁷ In other words, in working with a CFA model using LISREL, the researcher must specify either an all-X (i.e., exogenous) or an all-Y (i.e., endogenous) model. Although some researchers prefer to work within the all-Y model (see, e.g., Marsh, 1987), most work within the framework of the all-X model. For the sake of didactic purposes in clarifying this important distinction, let us now examine Fig. 1.2, in which the same model presented in Fig. 1.1 has been demarcated into measurement and structural components.
Considered separately, the elements modeled within each rectangle in Fig. 1.2 represent two CFA models: one exogenous (all-X model), and one endogenous (all-Y model). The enclosure of the two factors within the ellipse represents a structural model and, thus, is not of interest in CFA research. The CFA all-X model represents a one-factor model (ξ1) measured by three observed variables (x1-x3), whereas the CFA all-Y model represents a one-factor model (η1) measured by two observed variables (y1-y2). In either case, the regression of the observed variables on the factor, and the variances of both the errors of measurement and the factor, as well as the error covariance, are of primary interest.
Although both CFA models described in Fig. 1.2 represent first-order factor models, second- and higher order CFA models can also be analyzed using LISREL. First-order factor models are those in which correlations among the observed variables (i.e., the raw data) can be described by a smaller number of latent variables (i.e., factors), each of which may be considered to be one level, or one unidirectional arrow, away from the observed variables (see Fig. 1.2); these factors are termed

⁶Other parameters may also include the means and intercepts (see Chapter 10).
⁷Technically speaking, because there is no specification of causal relations among the factors in CFA models, there is no specific designation of variables as either exogenous or endogenous. Relatedly, ζ, the residual term associated with the prediction of one factor from another, will necessarily be zero, thereby making the all-X and all-Y models equivalent. This, then, accounts for the arbitrariness of the decision.
FIG. 1.2. A LISREL Structural Equation Model demarcated into its Measurement and Structural Components.
primary or first-order factors. Second-order factor models are those in which correlations among the first-order factors, in turn, can be represented by a single factor, or at least a smaller set of factors. Relatedly, one can think of these higher order factors as being two levels, or two unidirectional arrows, away from the observed variables; hence the term second-order factor. Likewise, if correlations among the second-order factors can be represented by yet another factor (or smaller set of factors), a third-order model exists, and so on. Although Kerlinger (1984) has deemed such hierarchical CFA models to be less commonly found in the literature, second-order CFA models are beginning to appear more frequently in the methodological journals. Discussion and application of CFA models in the present book are limited to first- and second-order models only. (For a more comprehensive discussion and explanation of first- and second-order CFA models, see Bollen, 1989b; Kerlinger, 1984.) We turn now to an explanation of the first-order CFA model within the framework of the LISREL program.
First-Order CFA Model
As demonstrated in the previous section, structural equation models can be depicted both diagrammatically and as a series of equations. Furthermore, the equation format can be expressed either in matrix form or as a series of regression statements. Using a simple two-factor model, let us now reexamine the LISREL CFA model within the framework of each of these two formats; we focus here on a first-order factor model. To help you in acquiring a thorough understanding of all matrices and their related elements, and because examples of both are found in the literature, this model will be expressed first as an all-X model, and then as an all-Y model.
The All-X CFA Model
Suppose that we have a two-factor model of self-concept. Let the two factors be academic self-concept (ASC) and social self-concept (SSC). Suppose that each factor has two indicator variables.⁸ Let the two measures of ASC be scores from the Academic Self-Concept subscale of the Self Description Questionnaire-I (SDQASC; Marsh, 1992) and the Scholastic Competence subscale of the Self-Perception Profile for Children (SPPCASC; Harter, 1985). Let the two measures of SSC be scores from the Peer Relations subscale of the SDQ-I (SDQSSC) and the Social Acceptance subscale of the SPPC (SPPCSSC). This model is presented schematically in Fig. 1.3. Shown here, then, is a two-factor model representing the constructs of ASC and SSC, with each factor having two indicator variables. The two observed measures for ASC are SDQASC and SPPCASC; for SSC, they are SDQSSC and SPPCSSC. Associated with each observed measure is an error term. The curved two-headed arrow indicates that the two factors, ASC and SSC, are correlated. Now let us translate this model into LISREL notation and reexamine it both diagrammatically and as a set of equations. We will focus first on the all-X CFA model and then turn our attention to the all-Y CFA model. Fig. 1.4, therefore, represents a schematic translation of the hypothesized model of self-concept with LISREL notation specific to the all-X formulation.
⁸It is now widely recommended that at least three observed variables should be used as indicators of the underlying constructs (see, e.g., Anderson & Gerbing, 1984; Bentler & Chou, 1987; Wothke, 1993). However, for didactic purposes in the examination of a simple model, I have limited the number of indicators to two.
FIG. 1.3. Hypothesized First-order CFA Model.
FIG. 1.4. Hypothesized First-order CFA Model with LISREL All-X Model Notation. From A Primer of LISREL: Basic Applications and Programming for Confirmatory Factor Analytic Models (p. 11), by B. M. Byrne, 1989, New York: Springer-Verlag. Copyright 1989 by Springer-Verlag. Reprinted with permission.
In reviewing Fig. 1.4, we see that the factors ASC and SSC are represented by ξ1 and ξ2, respectively; their correlation is represented by φ21. The parameters λ11 and λ21 represent the regression of x1 (SDQASC) and x2 (SPPCASC) on ξ1 (ASC), and λ32 and λ42 the regression of x3 (SDQSSC) and x4 (SPPCSSC) on ξ2 (SSC), respectively. Finally, δ1 to δ4 represent errors of measurement associated with x1 to x4, respectively.
We can now decompose the LISREL model in Fig. 1.4 into its related regression equations as:

x1 = λ11ξ1 + δ1
x2 = λ21ξ1 + δ2        (1.5)
x3 = λ32ξ2 + δ3
x4 = λ42ξ2 + δ4
This set of equations can be captured by a single expression that describes relations among the observed variables (xs), the latent variables (ξs), and the errors of measurement (δs); shown here as Equation 1.6, this expression represents the general LISREL factor-analytic model (for the all-X CFA model):

x1 = λ11ξ1 + δ1
x2 = λ21ξ1 + δ2
x3 = λ32ξ2 + δ3
x4 = λ42ξ2 + δ4

or, in compact form,

x = Λxξ + δ        (1.6)
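The equivalence between the four scalar regression statements and the single matrix expression x = Λxξ + δ can be checked numerically; all parameter and factor values below are hypothetical, chosen only to make the arithmetic visible:

```python
import numpy as np

# Lambda-x pattern of the two-factor model; numeric values are hypothetical
Lx = np.array([[0.8, 0.0],
               [0.7, 0.0],
               [0.0, 0.9],
               [0.0, 0.6]])
xi = np.array([1.2, -0.5])                 # one draw of the two factors
delta = np.array([0.1, -0.2, 0.05, 0.3])   # one draw of the four errors

# Matrix form: x = Lambda-x xi + delta
x = Lx @ xi + delta

# Scalar form of the first equation: x1 = lambda11*xi1 + delta1
x1 = 0.8 * 1.2 + 0.1
assert np.isclose(x[0], x1)
print(np.round(x, 3))
```

The zeros in `Lx` do the same work as the missing terms in the scalar equations: x1 and x2 receive no contribution from ξ2, and x3 and x4 none from ξ1.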
The parameters of this model are Λx, Φ, and Θδ, where:

Λx represents the matrix of regression coefficients related to the ξs (described earlier)
Φ (phi) is an n × n symmetrical variance-covariance matrix among the n exogenous factors
Θδ (theta-delta) is a symmetrical q × q variance-covariance matrix among the errors of measurement for the q exogenous observed variables
Consistent with the model in Fig. 1.4 and the series of regression statements in Equation 1.5, this general factor-analytic model can be expanded as:

 x        Λx          ξ         δ
x1      λ11   0               δ1
x2  =   λ21   0      ξ1   +   δ2        (1.7)
x3       0   λ32     ξ2       δ3
x4       0   λ42              δ4
In reviewing Equation 1.7, which represents the regression equations in matrix format, it is now easy to follow the direct transposition of parameters from the model depicted in Fig. 1.4. Further elaboration of the regression matrix (Λx), however, is in order. Accordingly, I wish to note four features of this matrix that may be helpful to you in your understanding of these equations. First, consistent with the number of factors (ξs) in the model, there are two columns, one representing each factor. Second, recall that in interpreting the double subscripts, the first one represents the row, and the second one the column. From a statistical perspective, λ11, for example, indicates an element in the first row and first column of the matrix; from a substantive perspective, it represents the regression coefficient for the first observed variable (x1) on Factor 1 (ξ1); likewise, λ21 represents the second indicator (x2) of Factor 1. Both of these parameters therefore appear in column 1; in contrast, those representing observed variables x3 and x4, the indicators of Factor 2 (ξ2), are shown in column 2. Third, the zeros are fixed values indicating that, for example, x1 and x2 are specified to load on Factor 1 and not on Factor 2; the reverse holds true for x3 and x4. In sum, then, the λs represent parameters to be estimated, whereas the 0s represent parameters whose values have been fixed to zero a priori. Finally, the Λ matrix is often termed the factor-loading matrix because it portrays the pattern by which each observed variable is linked to its respective factor; it is classified as a "full" matrix in terms of LISREL language. Information bearing on types of matrices is important in the specification of models in LISREL, and constitutes a topic that we review in Chapter 2.
As presented in Equation 1.7, both the factors (ξs) and the measurement error terms (δs) were expressed as vectors. Recall, however, that in addition to the regression coefficients, the core parameters in any structural equation model also include the variances and covariances of the independent variables.
Although it is easy to see that the ξs in the model represent independent latent variables, this designation for the δs is likely less obvious (and really not of any consequence, other than to make a point here); technically speaking, given that they impact on the observed variables (i.e., the arrow points from the error term to the observed variable), they may be regarded as independent variables. Overall, then, we will want to estimate the variances for both factors, as well as for the measurement errors.⁹ In addition, because the two factors are related, their covariance also needs to be estimated.

⁹Typically, the specification of structural equation models assumes no relations among the errors of measurement a priori. Such parameters, however, can often lead to major misspecification of a model; their presence is determined from the application of post hoc model-fitting procedures, a topic that is discussed at various points throughout the book.
To illustrate these variance-covariance parameters more explicitly, it is helpful to further expand the matrix equation shown in Equation 1.7 to include all elements in the related variance-covariance matrices. As noted earlier, variances and covariances of the ξs are represented in the Φ matrix; those for the δs are represented in the Θδ matrix. Both variance-covariance matrices are classified as being symmetric. This expansion is presented in Equation 1.8:

$$
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} =
\begin{bmatrix} \lambda_{11} & 0 \\ \lambda_{21} & 0 \\ 0 & \lambda_{32} \\ 0 & \lambda_{42} \end{bmatrix}
\begin{bmatrix} \xi_1 \\ \xi_2 \end{bmatrix} +
\begin{bmatrix} \delta_1 \\ \delta_2 \\ \delta_3 \\ \delta_4 \end{bmatrix};
\qquad
\Phi = \begin{bmatrix} \phi_{11} & \\ \phi_{21} & \phi_{22} \end{bmatrix},
\qquad
\Theta_\delta = \begin{bmatrix} \theta_{11} & 0 & 0 & 0 \\ 0 & \theta_{22} & 0 & 0 \\ 0 & 0 & \theta_{33} & 0 \\ 0 & 0 & 0 & \theta_{44} \end{bmatrix}
\tag{1.8}
$$
As with all variance-covariance matrices, the variances are represented on the diagonals and the covariances on the off-diagonals. Thus, φ11 and φ22 represent the variances of Factor 1 (ξ1) and Factor 2 (ξ2), respectively; φ21 represents their covariance. Likewise, θ11–θ44 represent the variance of each error term associated with the four observed variables (x1–x4).10 Because the model specified no error covariances, each of the off-diagonal elements in the Θδ matrix is fixed to zero. On the other hand, had the presence of error covariances (commonly termed correlated errors) been known a priori, then the appropriate off-diagonal parameter would have been specified as free (i.e., estimable).
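Although LISREL handles all of this internally, the roles of Λx, Φ, and Θδ can be sketched numerically. The short Python example below uses NumPy and arbitrary, hypothetical parameter values to build the covariance matrix implied by the measurement model, Σ = ΛxΦΛx′ + Θδ; this is the standard implied-covariance identity for a CFA model rather than anything specific to the present example:

```python
import numpy as np

# Hypothetical parameter values, chosen only to illustrate the matrix roles.
lam = np.array([[0.8, 0.0],    # lambda_11 (x1 on Factor 1)
                [0.7, 0.0],    # lambda_21 (x2 on Factor 1)
                [0.0, 0.9],    # lambda_32 (x3 on Factor 2)
                [0.0, 0.6]])   # lambda_42 (x4 on Factor 2)

phi = np.array([[1.00, 0.30],  # variances phi_11, phi_22 on the diagonal,
                [0.30, 1.00]]) # covariance phi_21 off the diagonal

theta_delta = np.diag([0.36, 0.51, 0.19, 0.64])  # theta_11 ... theta_44

# Standard implied-covariance identity for x = Lambda_x xi + delta:
sigma = lam @ phi @ lam.T + theta_delta
print(np.round(sigma, 3))
```

Note how every variance and covariance among the observed variables is generated from the three parameter matrices, which is exactly why the data variances and covariances carry the information used to estimate them.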
The All-Y CFA Model
Now let us review this same two-factor model expressed as an all-Y LISREL first-order CFA model. Presented first is the schematic presentation of the model (Fig. 1.5), followed by the set of four regression statements (Equation 1.9) and the general LISREL factor model (for the all-Y CFA model; Equation 1.10). The parameters of this model are Λy, Ψ, and Θε, where:
10As with the x/y designation of the λ parameters noted earlier, the subscript for each θ parameter could be preceded by a δ to distinguish it from the θ parameters in the Θε matrix. However, in the interest of clarity and neatness, these additional symbols are not included.
Structural Equation Models
Λy represents the matrix of regression coefficients related to the ηs (described earlier); Ψ (psi) is an m × m symmetric variance-covariance matrix among the m residual errors for the m endogenous factors; Θε (theta-epsilon) is a symmetric p × p variance-covariance matrix among the errors of measurement for the p endogenous observed variables.
$$
y_1 = \lambda_{11}\eta_1 + \varepsilon_1; \quad y_2 = \lambda_{21}\eta_1 + \varepsilon_2; \quad y_3 = \lambda_{32}\eta_2 + \varepsilon_3; \quad y_4 = \lambda_{42}\eta_2 + \varepsilon_4
\tag{1.9}
$$

$$
y = \Lambda_y \eta + \varepsilon
\tag{1.10}
$$

The regression equations expressed in matrix format are presented in Equation 1.11, followed by the expanded matrix equation (Equation 1.12):
FIG. 1.5. Hypothesized First-order CFA Model with LISREL All-Y Model Notation. From A Primer of LISREL: Basic Concepts, Applications, and Programming (p. 14), by B. M. Byrne, 1989, New York: Springer-Verlag. Copyright 1989 by Springer-Verlag. Reprinted with permission.
$$
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix} =
\begin{bmatrix} \lambda_{11} & 0 \\ \lambda_{21} & 0 \\ 0 & \lambda_{32} \\ 0 & \lambda_{42} \end{bmatrix}
\begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix} +
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \end{bmatrix}
\tag{1.11}
$$

$$
\Psi = \begin{bmatrix} \psi_{11} & \\ \psi_{21} & \psi_{22} \end{bmatrix},
\qquad
\Theta_\varepsilon = \begin{bmatrix} \theta_{11} & 0 & 0 & 0 \\ 0 & \theta_{22} & 0 & 0 \\ 0 & 0 & \theta_{33} & 0 \\ 0 & 0 & 0 & \theta_{44} \end{bmatrix}
\tag{1.12}
$$
Chances are you were doing just fine until you reached Equation 1.12, at which point you likely wondered where on earth the Psi (Ψ) matrix came from. In fact, the variance-covariance matrix for the residual terms (ζs) is represented in the Ψ matrix. However, because the LISREL CFA model does not include a causal path between the two factors (e.g., ξ1 → η1), the residual term is nonexistent (i.e., equal to a value of zero). Thus, variances and covariances for the η factors in the all-Y model are estimated in the Ψ matrix.
One Final Model
To finalize our examination of the first-order CFA model, and to allow me to introduce you to the issue of statistical identification, let us now take a brief look at another slightly more complex model that incorporates four factors, each of which has three indicator variables. A schematic presentation of this model is shown in Fig. 1.6. Considered as an all-X model, this CFA structure comprises four self-concept (SC) factors: academic SC (ASC; ξ1), social SC (SSC; ξ2), physical SC (PSC; ξ3), and emotional SC (ESC; ξ4). Each SC factor is measured by three observed variables (x1–x3; x4–x6; x7–x9; x10–x12, respectively). The reliability of each of these indicators is influenced (as indicated by the δs) by random measurement error (δ1–δ3; δ4–δ6; δ7–δ9; δ10–δ12, respectively). Each of these observed variables is regressed onto its respective factor (λ11–λ31; λ42–λ62; λ73–λ93; λ10,4–λ12,4, respectively). Finally, the four factors are shown to be intercorrelated (φ21, φ31, φ41, φ32, φ42, φ43).
FIG. 1.6. Hypothesized First-order CFA Model with LISREL Notation.
As noted earlier, the key parameters to be estimated in a CFA model are the regression coefficients (i.e., factor loadings), the factor and error variances, and in some models (as is the case here), the factor covariances. One extremely important caveat in working with structural equation models is to always tally the number of parameters in the model to be estimated prior to running the analyses. This information is critical to your knowledge of whether or not the model that you are testing is statistically identified, a topic to which I turn next. As a prerequisite to the discussion of identification, however, let us first count the number of parameters to be estimated for the model portrayed in Fig. 1.6. From a review of the figure, we can ascertain that there are 12 regression coefficients (λs), 16 variances (12 error variances (δs) and 4 factor variances (φs)), and 6 covariances (φ21–φ43). However, you will also note that the first of each set of λ parameters has been fixed to a value of 1.00 (they are therefore not to be estimated). The rationale underlying this constraint is tied to the issue of statistical identification. In total, then, there are 30 parameters to be estimated for the CFA model depicted in Fig. 1.6. Let us now turn to a brief discussion of this important concept.
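The tally just described can be reproduced with a few lines of Python (a sketch only; the counts come straight from Fig. 1.6):

```python
# Parameter tally for the CFA model of Fig. 1.6: 4 factors, 3 indicators each.
n_factors = 4
n_indicators = 4 * 3

loadings = n_indicators              # 12 lambdas in the model
fixed_loadings = n_factors           # first lambda of each set fixed to 1.00
error_variances = n_indicators       # 12 delta variances
factor_variances = n_factors         # 4 phi variances
factor_covariances = n_factors * (n_factors - 1) // 2   # phi_21 ... phi_43

free_parameters = (loadings - fixed_loadings) + error_variances \
                  + factor_variances + factor_covariances
print(free_parameters)  # 30, matching the count in the text
```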
The Issue of Statistical Identification
Statistical identification is a complex topic that is difficult to explain in nontechnical terms. Although a thorough explanation of the identification principle exceeds the scope of the present book, it is not critical to the reader's understanding and use of the book. Nonetheless, because some insight into the general concept of the identification issue will undoubtedly help you to better understand why, for example, particular parameters are specified as having certain fixed values, I attempt now to give you a brief, nonmathematical explanation of only one aspect of this concept: the measurement issue relative to the latent variables. Essentially, I address the so-called "t-rule," one of several tests associated with identification. Readers are urged to consult the following texts for a more comprehensive treatment of the topic: Bollen (1989b); Hayduk (1987); Long (1983a, 1983b); Saris and Stronkhorst (1984). In broad terms, the issue of identification focuses on whether or not there is a unique set of parameters consistent with the data. This question bears directly on the transformation of the variance-covariance matrix of observed variables (the data) into the structural parameters of the model under study. If a unique solution for the values of the structural parameters can be found, the model is considered to be identified, and the parameters are therefore estimable and the
model testable. If, on the other hand, a model cannot be identified, it indicates that many sets of very different parameter estimates could fit the data equally well and, in this sense, any one set of values would be arbitrary. Structural models may be just-identified, overidentified, or underidentified. A just-identified model is one in which there is a one-to-one correspondence between the data and the structural parameters. That is to say, the number of data variances and covariances equals the number of parameters to be estimated. However, despite the capability of the model to yield a unique solution for all parameters, the just-identified model is not scientifically interesting because it has no degrees of freedom and therefore can never be rejected. An overidentified model is one in which the number of estimable parameters is less than the number of data points (i.e., variances and covariances of the observed variables). This situation results in positive degrees of freedom that allow for rejection of the model, thereby rendering it of scientific use. The aim in SEM, then, is to specify a model such that it meets the criterion of overidentification. Finally, an underidentified model is one in which the number of parameters to be estimated exceeds the number of variances and covariances. As such, the model contains insufficient information (from the input data) for the purpose of attaining a determinate solution of parameter estimation; that is, an infinite number of solutions are possible for an underidentified model. Reviewing the CFA model in Fig. 1.6 again, let us now determine how many data points we have to work with (i.e., how much information do we have with respect to our data?). As noted earlier, these constitute the variances and covariances of the observed variables; with p variables, there are p(p + 1)/2 such elements. Because there are 12 observed variables, this means that we have 12(12 + 1)/2 = 78 data points.
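The arithmetic of the p(p + 1)/2 rule, together with the 30-parameter tally given earlier, can be checked mechanically (a sketch):

```python
def data_points(p):
    """Number of distinct variances and covariances among p observed variables."""
    return p * (p + 1) // 2

p = 12                 # observed variables in Fig. 1.6
free_parameters = 30   # tallied earlier from the model

df = data_points(p) - free_parameters
print(data_points(p), df)  # 78 data points, 48 degrees of freedom
```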
Prior to this discussion of identification, we determined a total of 30 unknown parameters. Thus, with 78 data points and 30 parameters to be estimated, we have an overidentified model with 48 degrees of freedom. It is important to point out, however, that the specification of an overidentified model is a necessary, but not sufficient, condition to resolve the identification problem. Indeed, the imposition of constraints on particular parameters can sometimes be beneficial in helping the researcher to attain an overidentified model. An example of such a constraint is illustrated in Chapter 5 with the application of a second-order CFA model. Linked to the issue of identification is the requirement that every latent variable have its scale determined. This requirement arises because these variables are unobserved and therefore have no definite metric scale; this can
be accomplished in one of two ways: The first approach is tied to specification of the measurement model, whereby the unmeasured latent variable is mapped onto its related observed indicator variable. This scaling requisite is satisfied by constraining to some nonzero value (typically 1.0) one factor loading parameter (λ) in each set of loadings designed to measure the same factor. This constraint holds for both the independent (ξ) and dependent (η) latent variables. In other words, in reviewing Fig. 1.6, this means that for one of the three regression paths leading from each SC factor to its set of observed indicators, some fixed value should be specified; this observed variable is termed a reference variable.11 The second way to establish the scale of the latent factors is to standardize them directly, thereby making their units of measurement equal to their population standard deviations. This approach eliminates the need for a reference variable in the Λ matrix; all factor loading parameters are therefore freely estimated. Jöreskog and Sörbom (1993a, 1993b) claimed that standardization of the latent variables is the more useful and convenient way of assigning the units of measurement to the latent variables. Thus, by default, LISREL 8 automatically standardizes both the independent and dependent latent factors if no reference variable is specified for the measurement model. (This default represents a new feature in LISREL 8; only standardization of the independent latent factor, ξ, was possible with LISREL 7.) If, on the other hand, a reference variable is assigned to a particular latent factor, its variance will automatically be estimated by the program.
It is important to note, however, that for reasons linked to statistical identification, it is not possible to estimate all factor loading regression coefficients and the variance related to the same factor; one must constrain either one parameter in a set of regression coefficients or the factor variance. With respect to the model shown in Fig. 1.6, the scale has been established by constraining to a value of 1.0 the first λ parameter in each set of observed variables. From the schema presented in Fig. 1.6, we can now summarize the structural regression portion of the model as a series of equations:

$$
\begin{aligned}
x_1 &= 1.00\,\xi_1 + \delta_1; & x_2 &= \lambda_{21}\xi_1 + \delta_2; & x_3 &= \lambda_{31}\xi_1 + \delta_3 \\
x_4 &= 1.00\,\xi_2 + \delta_4; & x_5 &= \lambda_{52}\xi_2 + \delta_5; & x_6 &= \lambda_{62}\xi_2 + \delta_6 \\
x_7 &= 1.00\,\xi_3 + \delta_7; & x_8 &= \lambda_{83}\xi_3 + \delta_8; & x_9 &= \lambda_{93}\xi_3 + \delta_9 \\
x_{10} &= 1.00\,\xi_4 + \delta_{10}; & x_{11} &= \lambda_{11,4}\xi_4 + \delta_{11}; & x_{12} &= \lambda_{12,4}\xi_4 + \delta_{12}
\end{aligned}
\tag{1.13}
$$
11Although the decision as to which parameter to constrain is purely an arbitrary one, the measure having the highest reliability is recommended, if this information is known; the value to which the parameter is constrained is also arbitrary.
Second-Order CFA Model
In our previous factor analytic model, we had four factors (ASC, SSC, PSC, ESC) that operated as independent variables; each could be considered to be one level, or one unidirectional arrow, away from the observed variables, and therefore served as first-order factors. However, it may be the case that the theory argues for some higher level factor that is considered accountable for the lower order factors. Basically, the number of levels, or unidirectional arrows, that the higher order factor is removed from the observed variables determines whether a factor model is labeled as being second-order, third-order, or some higher order; as noted earlier, only a second-order model will be considered here. Accordingly, let us examine the pictorial representation of this model in Fig. 1.7. Although this model has essentially the same first-order factor structure as the one shown in Fig. 1.6, it differs in that a higher order general self-concept (GSC) factor is hypothesized as accounting for, or explaining, all of the covariances among the first-order factors. As such, general SC is termed the second-order factor. It is important to take particular note of the fact that general SC does not have its own set of measured indicators; rather, it is linked indirectly to those measuring the lower order factors. Let us now take a closer look at the parameters to be estimated for this second-order model. Because LISREL assigns different notation to parameters that are estimated in all-X versus all-Y measurement models, the factorial structure portrayed in Fig. 1.7 is readily distinguishable from the one shown in Fig. 1.6, despite the fact that the first-order structure is exactly the same. Indeed, several distinctive features of this second-order model set it apart from the first-order model shown in Fig. 1.6. Let us now take a closer look at these characteristics. It is likely that the first thing you noticed about the model in Fig.
1.7 was the change in notation for the model parameters. This modification reflects the fact that the four SC factors (ASC, SSC, PSC, ESC) are now dependent, rather than independent, variables in the model because they are presumed to be explained by the higher order factor of general SC; the latter therefore serves as the one and only independent factor (ξ1). Relatedly, because the first-order SC factors operate as dependent variables, the notation of their observed indicators and measurement errors is consistent with an all-Y model.
FIG. 1.7. Hypothesized Second-order CFA Model with LISREL Notation.
Relative to the altered parameter notation, a second feature of this model is the presence of single-headed arrows leading from the second-order factor (ξ1) to each of the first-order factors (η1–η4). These regression paths (γ11, γ21, γ31, γ41)12 represent second-order factor loadings, and all are freely estimated. Recall, however, that for reasons linked to the statistical identification issue, a constraint must be placed either on one of the regression paths or on the variance of an independent factor; both parameters cannot be estimated. Because the impact of general SC on each of the lower-order SC factors is of primary interest in second-order CFA models, the variance of the higher-order factor is necessarily constrained to equal 1.0. A third aspect of this second-order model that perhaps needs amplification is the initial appearance that the first-order factors operate as both independent and dependent variables. This, however, is not so; variables can serve either as independent or as dependent variables in a model, but not both.13 Because the first-order factors function as dependent variables, this necessarily means that their variances and covariances are no longer estimable parameters in the model; indeed, such variation is presumed to be accounted for by the higher order factor. In comparing Figs. 1.6 and 1.7, then, you will note that there are no longer two-headed curved arrows linking the first-order SC factors, thereby indicating that neither their factor covariances nor variances are to be estimated. Finally, the prediction of each of the first-order factors from the second-order factor is presumed not to be without error. Thus, a residual error term is associated with each of the lower level factors (ζ1–ζ4).
As a first step in determining whether this second-order model is identified, we now sum the number of parameters to be estimated; as such, we have: 8 first-order regression coefficients (λ21, λ31, λ52, λ62, λ83, λ93, λ11,4, λ12,4), 4 second-order regression coefficients (γ11–γ41), 12 measurement error variances (ε1–ε12), and 4 residual error terms (ζ1–ζ4), for a total of 28. Given that there are 78 pieces of information in the sample variance-covariance matrix, we conclude that this model is overidentified with 50 degrees of freedom.
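The same bookkeeping can be sketched for the second-order model:

```python
# Parameter tally for the second-order CFA model of Fig. 1.7.
first_order_loadings = 8    # lambda_21 ... lambda_12,4 (one loading per factor fixed)
second_order_loadings = 4   # gamma_11 ... gamma_41
error_variances = 12        # one per observed variable
residual_variances = 4      # zeta_1 ... zeta_4, estimated in the Psi matrix

free_parameters = (first_order_loadings + second_order_loadings
                   + error_variances + residual_variances)
df = 12 * (12 + 1) // 2 - free_parameters
print(free_parameters, df)  # 28 parameters, 50 degrees of freedom
```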
12As noted earlier with respect to the first-order regression coefficients, the first number of the subscript represents the dependent variable whereas the second one represents the independent variable. 13In SEM, once a variable has an arrow pointing at it, thereby targeting it as a dependent variable, it maintains this status throughout the analyses.
Before leaving this identification issue, however, a word of caution is in order. With complex models in which there may be several levels of latent variable structures, it is wise to visually check each level separately for evidence that identification has been attained. For example, although we know from our initial CFA model that the first-order level is identified, it is quite possible that the second-order level may indeed be underidentified. Because the first-order factors function as indicators of (i.e., the input data for) the second-order factor, identification is easy to assess. In the present model, we have four factors, thereby giving us 10 (4[4 + 1]/2) pieces of information from which to formulate the parameters of the higher order structure. According to the model depicted in Fig. 1.7, we wish to estimate 8 parameters, thus leaving us with 2 degrees of freedom, and an overidentified model. However, suppose that we only had three first-order factors. We would then be left with a just-identified model as a consequence of trying to estimate six parameters from six (3[3 + 1]/2) pieces of information. In order for such a model to be tested, additional constraints would need to be imposed. Finally, let us suppose that there were only two first-order factors; we would then have an underidentified model because there would be only three pieces of information, albeit four parameters to be estimated. Although it might still be possible to test such a model, given further restrictions on the model, the researcher would be better advised to reformulate his or her model in light of this problem (see Rindskopf & Rose, 1988). Considering the model in Fig. 1.7, let us now write the series of regression statements that summarize its configuration.
As such, we need to address two components of the model: the higher order factor structure (represented by a structural model in the LISREL sense), and the lower order factor structure (represented by the measurement model in the LISREL sense). These are:

$$
\begin{aligned}
y_1 &= 1.00\,\eta_1 + \varepsilon_1; & y_2 &= \lambda_{21}\eta_1 + \varepsilon_2; & y_3 &= \lambda_{31}\eta_1 + \varepsilon_3 \\
y_4 &= 1.00\,\eta_2 + \varepsilon_4; & y_5 &= \lambda_{52}\eta_2 + \varepsilon_5; & y_6 &= \lambda_{62}\eta_2 + \varepsilon_6 \\
y_7 &= 1.00\,\eta_3 + \varepsilon_7; & y_8 &= \lambda_{83}\eta_3 + \varepsilon_8; & y_9 &= \lambda_{93}\eta_3 + \varepsilon_9 \\
y_{10} &= 1.00\,\eta_4 + \varepsilon_{10}; & y_{11} &= \lambda_{11,4}\eta_4 + \varepsilon_{11}; & y_{12} &= \lambda_{12,4}\eta_4 + \varepsilon_{12}
\end{aligned}
\tag{1.14}
$$
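The level-by-level identification check described just before these equations can also be expressed as a small helper function (a sketch; the parameter counts for the three scenarios come from the discussion above):

```python
def higher_order_status(n_factors, n_params):
    """Identification status of the higher-order level, treating the
    first-order factors as the input data for that level."""
    pieces = n_factors * (n_factors + 1) // 2
    df = pieces - n_params
    if df > 0:
        return "overidentified"
    return "just-identified" if df == 0 else "underidentified"

print(higher_order_status(4, 8))  # 10 pieces, 8 parameters -> overidentified
print(higher_order_status(3, 6))  # 6 pieces, 6 parameters  -> just-identified
print(higher_order_status(2, 4))  # 3 pieces, 4 parameters  -> underidentified
```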
Just as we were able to capture the essence of the regression statements describing the first-order CFA model in a single matrix equation (see Equation 1.10), so we can for this second-order CFA model. Consistent with
the two levels of factor structure in this model, however, the matrix equation necessarily comprises two separate statements. We turn first to the higher order structure, which can be summarized as:

$$
\eta = \Gamma\xi + \zeta
\tag{1.15}
$$
Because Equation 1.15, representing the structural model, is somewhat tricky due to the change in notation related to the ζs, let us review its expansion in two stages: expressed first as a set of vectors, and second in its expanded matrix form. We turn now to the equation in vector form.
$$
\begin{bmatrix} \eta_1 \\ \eta_2 \\ \eta_3 \\ \eta_4 \end{bmatrix} =
\begin{bmatrix} \gamma_{11} \\ \gamma_{21} \\ \gamma_{31} \\ \gamma_{41} \end{bmatrix} \xi_1 +
\begin{bmatrix} \zeta_1 \\ \zeta_2 \\ \zeta_3 \\ \zeta_4 \end{bmatrix}
\tag{1.16}
$$
Before rewriting this equation to show the expansion of the ζ vector, it is worthwhile to note two important points. First, as shown earlier for the Λ matrix, Γ is also a regression matrix and, as such, can be classified as a full matrix. However, within the context of a higher order CFA model, it serves as a special case where there is only one ξ, albeit more than one η. As a consequence, its elements remain described in vector form. To understand better how this works, recall that the order of the Γ matrix is m × n, where m represents the number of endogenous (η) factors, and n, the number of exogenous (ξ) factors; in addition, m × n represents a row × column orientation. Thus, with respect to the model shown in Fig. 1.7, we know that the Γ matrix will be described by four rows (the ηs) and one column (the ξ), thereby resulting in a vector. The second point worthy of note pertains to the residual disturbance term which, as shown on the diagram (Fig. 1.7), and in Equation 1.16, is represented by the Greek letter zeta (ζ). This variable, however, is estimated in
14Had the model involved a third level (i.e., a third-order CFA model), then the equation would involve the regression of the second-order factors on the third-order factor, which would then have involved the Beta matrix; the equation would then be: η = Bη + Γξ + ζ. However, this expansion goes beyond the intent of the present book.
the Psi (Ψ) matrix; thus, the elements of this matrix are ψs, and not ζs. We turn now to Equation 1.17 showing the expanded Ψ matrix:
$$
\begin{bmatrix} \eta_1 \\ \eta_2 \\ \eta_3 \\ \eta_4 \end{bmatrix} =
\begin{bmatrix} \gamma_{11} \\ \gamma_{21} \\ \gamma_{31} \\ \gamma_{41} \end{bmatrix} \xi_1 + \zeta;
\qquad
\Psi = \begin{bmatrix} \psi_{11} & 0 & 0 & 0 \\ 0 & \psi_{22} & 0 & 0 \\ 0 & 0 & \psi_{33} & 0 \\ 0 & 0 & 0 & \psi_{44} \end{bmatrix}
\tag{1.17}
$$
Turning to the lower order factor structure, we see first the summary matrix equation (Equation 1.18), followed by its expansion (Equation 1.19) showing the elements of each matrix. However, due to space restrictions, and because the error variance-covariance matrix (Θε) was illustrated for the first-order CFA model, measurement error in this equation is represented here only in vector form (ε). By now, this equation should look familiar to you because it follows the same pattern as the first-order CFA structure based on an all-Y model:

$$
y = \Lambda_y \eta + \varepsilon
\tag{1.18}
$$
$$
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6 \\ y_7 \\ y_8 \\ y_9 \\ y_{10} \\ y_{11} \\ y_{12} \end{bmatrix} =
\begin{bmatrix}
\lambda_{11} & 0 & 0 & 0 \\
\lambda_{21} & 0 & 0 & 0 \\
\lambda_{31} & 0 & 0 & 0 \\
0 & \lambda_{42} & 0 & 0 \\
0 & \lambda_{52} & 0 & 0 \\
0 & \lambda_{62} & 0 & 0 \\
0 & 0 & \lambda_{73} & 0 \\
0 & 0 & \lambda_{83} & 0 \\
0 & 0 & \lambda_{93} & 0 \\
0 & 0 & 0 & \lambda_{10,4} \\
0 & 0 & 0 & \lambda_{11,4} \\
0 & 0 & 0 & \lambda_{12,4}
\end{bmatrix}
\begin{bmatrix} \eta_1 \\ \eta_2 \\ \eta_3 \\ \eta_4 \end{bmatrix} +
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \\ \varepsilon_6 \\ \varepsilon_7 \\ \varepsilon_8 \\ \varepsilon_9 \\ \varepsilon_{10} \\ \varepsilon_{11} \\ \varepsilon_{12} \end{bmatrix}
\tag{1.19}
$$
In summary, and for sake of completion, I wish to note that the general LISREL equation for a second-order CFA model derives from a combination
of both Equation 1.15 (representing the higher order structural model) and Equation 1.18 (representing the lower order measurement model). The product of this combination, then, is:

$$
y = \Lambda_y(\Gamma\xi + \zeta) + \varepsilon
\tag{1.20}
$$
THE LISREL FULL STRUCTURAL EQUATION MODEL
In contrast to a first-order CFA model, which comprises only a measurement component, and a second-order CFA model, for which the higher order level is represented by a reduced form of a structural model, the full structural equation model encompasses both a measurement and a complete structural model. Accordingly, the full model embodies a system of variables whereby latent factors are regressed on other factors as dictated by theory, as well as on the appropriate observed measures. In other words, in the full SEM, certain latent variables are connected by one-way arrows, the directionality of which reflects hypotheses bearing on the causal structure of variables in the model. For a clearer conceptualization of this model, let us examine the relatively simple structure presented in Fig. 1.8. The structural component of this model represents the hypothesis that a child's self-confidence (SCONF) derives from his or her self-perception of overall social competence (SSC; social SC) which, in turn, is influenced by the child's perception of how well he or she gets along with family members (SSCF), as well as his or her peers at school (SSCS). The measurement component of the model shows each of the SC factors to have three indicator measures, and the self-confidence factor to have two. Turning first to the structural part of the model, we can see that there are four factors; the two independent factors (ξ1, ξ2) are postulated as being correlated with each other, as indicated by the curved two-way arrow joining them (φ21), but they are linked to the other two factors by a series of regression paths, as indicated by the unidirectional arrows. Because the factors SSC (η1) and SCONF (η2) have one-way arrows pointing at them, they are easily identified as dependent variables in the model.
Residual errors associated with the regression of η1 on both ξ1 and ξ2, and the regression of η2 on η1, are captured by the disturbance terms ζ1 and ζ2, respectively.
FIG. 1.8. Hypothesized Full Structural Equation Model with LISREL Notation.
Finally, because one path from each of the two independent factors (ξ1, ξ2) to their respective indicator variables is fixed to 1.0, their variances can be freely estimated; variances of the dependent variables (η1, η2), however, are not parameters in the model. By now, you likely feel fairly comfortable in interpreting the measurement portion of the model, and thus, substantial elaboration is not necessary here. As usual, associated with each observed measure is an error term, the variance of which is of interest. (Because the observed measures technically
operate as dependent variables in the model, as indicated by the arrows pointing towards them, their variances are not estimated.) Finally, to establish the scale for each unmeasured factor in the model (and for purposes of identification), the first of each set of regression paths is fixed to 1.0; note again that path selection for the imposition of this constraint was purely arbitrary. For this, our last example, let us again determine if we have an identified model. Given that we have 11 observed measures, we know that we have 66 (11[11 + 1]/2) pieces of information from which to derive the parameters of the model. Counting up the unknown parameters in the model, we see that we have 24 parameters to be estimated: 7 measurement regression paths (4 λxs and 3 λys; the factor loadings); 3 structural regression paths (γ11, γ12, β21); 11 error variances (6 δs and 5 εs); 2 residual error variances (ζ1, ζ2); 1 covariance (φ21). We therefore have 42 (66 - 24) degrees of freedom and, as a consequence, an overidentified model. Our final task is to translate this full model into a set of equations. With the structural model being summarized first, followed by the measurement model, these equations are:
$$
\begin{aligned}
\text{structural:}\quad & \eta_1 = \gamma_{11}\xi_1 + \gamma_{12}\xi_2 + \zeta_1; \qquad \eta_2 = \beta_{21}\eta_1 + \zeta_2 \\
\text{measurement:}\quad & x_1 = 1.00\,\xi_1 + \delta_1; \quad x_2 = \lambda_{21}\xi_1 + \delta_2; \quad x_3 = \lambda_{31}\xi_1 + \delta_3 \\
& x_4 = 1.00\,\xi_2 + \delta_4; \quad x_5 = \lambda_{52}\xi_2 + \delta_5; \quad x_6 = \lambda_{62}\xi_2 + \delta_6 \\
& y_1 = 1.00\,\eta_1 + \varepsilon_1; \quad y_2 = \lambda_{21}\eta_1 + \varepsilon_2; \quad y_3 = \lambda_{31}\eta_1 + \varepsilon_3 \\
& y_4 = 1.00\,\eta_2 + \varepsilon_4; \quad y_5 = \lambda_{52}\eta_2 + \varepsilon_5
\end{aligned}
\tag{1.21}
$$
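Looping back to the identification tally given just before these equations, the arithmetic can be verified mechanically (a sketch mirroring the counts in the text):

```python
# Parameter tally for the full structural equation model of Fig. 1.8.
measurement_paths = 7    # 4 free lambda-x and 3 free lambda-y loadings
structural_paths = 3     # gamma_11, gamma_12, beta_21
error_variances = 11     # 6 deltas and 5 epsilons
residual_variances = 2   # zeta_1, zeta_2
factor_covariances = 1   # phi_21

free_parameters = (measurement_paths + structural_paths + error_variances
                   + residual_variances + factor_covariances)
df = 11 * (11 + 1) // 2 - free_parameters
print(free_parameters, df)  # 24 parameters, 42 degrees of freedom
```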
Because (a) by now the equations representing the measurement part of the model likely present no problem to you, and (b) the beta matrix (B) is new to you, further elaboration is limited to the structural portion of the model (i.e., the structural model). As such, the two structural statements in Equation 1.21 can be summarized more compactly as:

$$
\eta = B\eta + \Gamma\xi + \zeta
\tag{1.22}
$$
Equation 1.22 therefore represents the general LISREL model for a full structural equation model. Now let us focus on the decomposition of this equation. Recall that B represents an m × m regression matrix that relates the m endogenous factors (ηs) to one another. Thus, because there are two
endogenous factors in the model (η1, η2), we know that it will be a 2 × 2 matrix having four elements. Furthermore, being a regression matrix, B is estimated within the framework of a full matrix format. However, for mathematical reasons,15 the main diagonal of elements in B is always zero (Bollen, 1989b). Capturing the elements of the structural equations as expressed in the regression statements of Equation 1.21 within the context of the general LISREL full structural equation model (Equation 1.22), let us now review the latter in its expanded matrix form:
$$
\begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix} =
\begin{bmatrix} 0 & 0 \\ \beta_{21} & 0 \end{bmatrix}
\begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix} +
\begin{bmatrix} \gamma_{11} & \gamma_{12} \\ 0 & 0 \end{bmatrix}
\begin{bmatrix} \xi_1 \\ \xi_2 \end{bmatrix} +
\begin{bmatrix} \zeta_1 \\ \zeta_2 \end{bmatrix}
\tag{1.23}
$$
For didactic purposes, albeit at the risk of excessive redundancy, I consider it worthwhile to review the order and composition of both the B and Γ matrices; we turn first to the B matrix. Because (a) there are two endogenous factors in the model (η1, η2), and (b) B represents an m × m matrix of endogenous factors, the B matrix will be of a 2 × 2 order, thereby comprising four elements. In Fig. 1.8, we see that the single element, β21, represents the influence of η1 on η2 and, as its subscript indicates, is located at the juncture of the second row and first column. Turning to the Γ matrix, we know that it is an m × n matrix representing the regression of n exogenous factors (ξs) on m endogenous factors (ηs). Given that there are two ξs and two ηs in the model (Fig. 1.8), it follows that this matrix will also be of a 2 × 2 order, thereby comprising four elements. As shown in Equation 1.23, and as indicated by their subscripts, γ11 is located at the juncture of the first row and first column of the Γ matrix, and γ12 at the juncture of the first row and second column. Finally, as illustrated earlier with respect to the second-order CFA model, the matrix of residual error terms (ζs) can also be expanded to reveal all elements in its variance-covariance matrix that are estimated within the
15It serves to remove η from the right side of the equation. As such, it assumes that the factor η does not cause itself.
framework of the Ψ matrix. However, because this expansion process has already been presented, it is not repeated here. Now that you are familiar with the basic concepts underlying structural equation models, and with the related LISREL notation, let us turn our attention to details related to use of the LISREL program and its companion package PRELIS, as well as to use of the SIMPLIS command language.
CHAPTER 2

Using LISREL, PRELIS, and SIMPLIS

The purpose of this chapter is threefold: (a) to introduce you to the general format and command language of the LISREL 8 program (Jöreskog & Sörbom, 1989, 1993b), (b) to describe the major uses, general format, and command language of the PRELIS 2 program (Jöreskog & Sörbom, 1988, 1993c), and (c) to describe the basic features of SIMPLIS (Jöreskog & Sörbom, 1993c), an alternate command language within the LISREL 8 program. (For simplicity, these programs are henceforth referenced as LISREL and PRELIS.) Expanded comments regarding various aspects of the input and output files are addressed in subsequent chapters that focus on specific CFA or full SEM applications. We turn first to the LISREL program.
WORKING WITH LISREL 8

LISREL, an acronym for LInear Structural RELations, is grounded in the Jöreskog-Keesling-Wiley approach to representing systems of structural equations (Bentler, 1980). Thus, its command language is based on a matrix representation of the CFA or full structural equation model under study. In chapter 1 we examined the basic matrices, within the LISREL framework, used to describe both types of models (Λx, Λy, Φ, Θδ, Θε, B, Γ, Ψ). Now it is time to see (a) how these matrices are properly described to form admissible program statements and paragraphs used in the building of a LISREL input file, and (b) how to run a LISREL job so that the model specified in the input file is tested. We turn now to each of these LISREL operations.
Creation of an Input File An input file describes the model under study. Typically it is composed of several sections, each of which is initiated with a specific word (or command); clustered within each section will be more specific statements called subcommands. In LISREL, all files can be decomposed into four sections: Title, Data Specification, Model Specification, and Output Specification. The latter three sections are required, whereas the Title section is optional. In building an input file using LISREL, we look first at the basic tenets governing the construction of these files, then we examine the components of their basic structure, and finally, we review input files related to the three example problems presented in chapter 1.
Basic Rules and Caveats

Keywords. With one exception, LISREL is controlled by two-letter keywords that represent both control lines and parameter names. Although these keywords may contain any number of letters (e.g., DATAPARAMETERS), only the first two are recognized by the program (e.g., DA). The one exception to this rule is the ALL option on the VA and ST commands. Keywords can be written in either upper- or lowercase interchangeably, and must be separated by blanks. The maximum line length for LISREL files is 127 columns. However, commands may be continued over several lines by simply adding a space followed by a "c" (indicating continuation) at the end of each line to be extended. Because a keyword and its specified value must appear on the same line, it is essential that each continued line begin with the repeated keyword if its specified value would otherwise exceed the 127-column limit.
Comments. The inclusion of comments at various points in an input file can often be useful in reminding you of certain information. LISREL makes this possible through the use of an exclamation mark (!) or a slash-asterisk combination (/*). As such, everything on a line that follows either of these two symbols will be ignored by the program. In addition, blank lines are accepted without the ! or /*.
Control Lines. Each of the four sections comprising a LISREL input file is initiated with a particular control line:
• Title (optional)
• DAta
• MOdel
• OUtput
The two letters appearing in bold uppercase for each of the last three commands represent required keywords. Following these keywords will be additional subcommands that serve to complete the information required in order for the file to be properly executed. DA represents the data specification section of the file in which you describe your data. The MO keyword initiates the model specification section of the file where you describe the model being tested. Finally, the OU keyword allows you to tailor the type of information to be provided in the output file. In building the LISREL input file, it is essential that the title (if used), followed by information related to the three control lines, be incorporated in the following order: DA, MO, OU. We turn now to a more detailed description of each of these four sections of the input file. For our purposes in this section of the chapter, all examples used to illustrate various parts of the input file are based on the simple model presented in Fig. 1.3.
Title Specification As noted earlier, LISREL does not require you to include a title for your input file in order to run the job. Nonetheless, I highly recommend that you not only make a practice of always including a title, but also that you be generous in the amount of information that is included (see also Jöreskog & Sörbom, 1993b). We have all experienced a situation in which what might have seemed obvious when we initially conducted an analysis may not seem quite so obvious several months hence. At this time we may wish to reexamine the data using the same input files. Indeed, liberal use of title information is particularly helpful in the case where numerous LISREL runs were implemented for a given data set. A given title for an input file can be entered on the first line without a two-letter keyword, and any number of lines may be used. However, because the program will read title lines until it encounters a line beginning with the letters "DA," "Da," "dA," or "da," the title may not begin with a word that
is spelled "da - - - -." One way to avoid this possible problem is to begin each title line with an exclamation mark (!). An example input file title, based on Fig. 1.3, is:

CFA of the Self-Concept Structure
Elementary Teachers
Initially Hypothesized Model
Data Specification In addition to required information that must be entered in this section of the input file in order for the program to run, there is also optional information that can be included. Each is now considered separately.
Required Input. The first control line keyword, as noted earlier, is DA. To fully describe the data to be analyzed, which is the purpose of this section of the file, additional information is provided by means of the following specifications:

NGroups = number of groups for which data are available. If the data comprise one group only, this specification is not required.
NInpvar = number of input variables (i.e., number of observed variables in the data set).
NObs = number of observations (i.e., sample size N). If N is unknown and raw data are being read in, NO should be left blank and, by default, the program will compute N based on listwise deletion1 of missing data. If the data are read in the form of a covariance, correlation, or moment matrix, N must be specified, or the program will automatically stop. Finally, if this matrix were generated through the use of PRELIS using pairwise deletion2 of missing data, N should be a representative average of the pairwise sample sizes.
MAtrix = matrix to be analyzed.
• KM = correlation matrix.
• CM = covariance matrix.
• MM = moment matrix.
• AM = augmented moment matrix.
• PM = polychoric or polyserial correlation matrix.
1In the computation of a correlation or covariance matrix, cases having at least one missing score are ignored by the program for all variables in the data.
2In the computation of a correlation or covariance matrix, cases having a missing score for a particular variable are deleted from the computation for that variable only.
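The listwise and pairwise deletion schemes described in these footnotes can be sketched with a small numpy example. This is a modern illustration of the two counting rules only, not anything LISREL or PRELIS actually executes; the data values are hypothetical.

```python
import numpy as np

# Hypothetical data: 5 cases on 2 variables, with NaN marking missing scores.
data = np.array([[1.0, 2.0],
                 [2.0, np.nan],
                 [np.nan, 3.0],
                 [4.0, 5.0],
                 [5.0, 6.0]])

# Listwise deletion: any case with a missing score on ANY variable
# is dropped from all computations.
complete = ~np.isnan(data).any(axis=1)
n_listwise = int(complete.sum())  # only fully observed cases count

# Pairwise deletion: each variable keeps every case on which it is
# actually observed, so the effective N differs by variable (or pair).
n_pairwise = (~np.isnan(data)).sum(axis=0)  # usable cases per variable
```

Here listwise deletion leaves 3 complete cases, whereas each variable individually retains 4; this is why a PRELIS matrix built under pairwise deletion has no single N, and a representative average must be supplied on the DA line.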
Let us stop for a moment and review an example of data description based on the syntactical information provided earlier.

DA NI=4 NO=400 MA=CM

In this example, the DA line indicates that the data to be analyzed comprise four input variables, the sample size is 400, and the data are to be analyzed as a covariance matrix. Given that no NG specification is provided, the program will consider the data to be relevant to one group only. It is important to note that the matrix to be analyzed may not necessarily be the same as the one used in reading the data. More specifically, the LISREL program requires two pieces of information regarding the data before the analyses can be invoked: (a) the type of matrix by which the data are to be analyzed (the specification entered earlier), and (b) the type of matrix by which the data are to be read (to be discussed next). Thus, although the example shown here indicates that the covariance matrix is to be analyzed, the data may be read into the program from, for example, a correlation (KM) matrix. This point becomes more evident as we move on to examine further command language related to the specification of data. The DA command, therefore, is further defined by four additional pieces of information, each of which is entered on a separate line: 1. The first line to follow the DA control line contains the two-letter keyword LA, indicating that the variable labels are to follow. 2. The next line provides the names for each of the input variables. Although these labels can be of any length, only the first eight characters are retained by the program. The easiest way to enter this information is to use free format. As such, the entry of labels must be consistent with both the number of observed variables and the order in which they appear in the data. All variable labels must be separated by delimiters (blanks, commas) and, thus, blanks and commas cannot appear within the label name. 3.
The third line provides two pieces of information describing the data being read by the program: (a) the type of matrix within which they are contained (cf. matrix type to be analyzed noted earlier), and (b) its matrix form. This information is represented by two-letter keywords entered on the same line and separated by a blank. The matrix types were noted earlier; the two possible matrix forms are:
• FU = full matrix.
• SY = symmetric matrix.
4. Line 4 specifies the link between each number in the data set and the variable labels. In other words, it tells the program how to read the input data by providing it with a formula, otherwise known as the data format. This format can be either fixed or free. In either case, LISREL reads all data row-wise from left to right.
48
Chapter 2
Fixed format indicates that the data are to be read according to a detailed formula known as a FORTRAN statement. This format specifies the number and location of columns occupied by each variable, as well as the number of decimal points included for each variable score, if any. In LISREL, the FORTRAN statement must always begin with a left parenthesis and end with a right parenthesis. Free format, on the other hand, indicates that the data are to be read as one long string of numbers, with no specific column(s) being associated with any particular variable. The only requirement with free format is that each variable score be separated by a blank.3 These three (or four, if a format statement is specified) lines just described are then followed by the data, which can either reside within the input file itself, or in a separate external file. The first two of the following examples illustrate data that are embedded within the input file, whereas the third example demonstrates the use of an external data file. Based on Fig. 1.3, the first example demonstrates data specification based on a fixed format:

LA
SDQASC SPPCASC SDQSSC SPPCSSC
KM SY
(4F3.2)
100
 70100
 50 55100
 45 48 80100
Here we see that we have four observed variables, which are entered as a symmetric correlation matrix. Relatedly, the FORTRAN statement indicates that there are four variables, each comprising a three-digit score taken to two decimal places. Thus, LISREL would read the data from left to right, counting three digits and then placing a decimal point to the left of the last two digits for each variable. Blanks represent zeros and are therefore taken into consideration in counting the three digits. The second example represents the same data matrix, but this time it is entered in free format:
3 In older versions of the program, an asterisk was required in the specification of free format. This is no longer necessary with LISREL 8.
LA
SDQASC SPPCASC SDQSSC SPPCSSC
KM SY
1.00
.70 1.00
.50 .55 1.00
.45 .48 .80 1.00

Finally, this third example demonstrates a case where the data reside in a separate file external to the input file.

LA
SDQASC SPPCASC SDQSSC SPPCSSC
KM=SC.DAT FO
(4F3.2)
The line following the list of variable names indicates that the data are in the form of a correlation matrix and reside in a file called "SC.DAT." The "FO" indicates that the format for this data set follows on the next line of the input file, rather than being in the data file; as with the first of these three examples, this FORTRAN statement must be enclosed in parentheses.
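The way a (4F3.2) statement carves a line into scores can be sketched in Python. This is only an illustration of the rule described above (three-column fields, two implied decimal places, blanks counted as zeros), not LISREL's actual FORTRAN reader; the function name is my own.

```python
def parse_fixed(line, width=3, decimals=2):
    """Split a line into fixed-width fields and insert an implied decimal point."""
    values = []
    for start in range(0, len(line), width):
        field = line[start:start + width].replace(" ", "0")  # blanks count as zeros
        values.append(int(field) / 10 ** decimals)
    return values

# The last row of the symmetric correlation matrix shown above:
row = parse_fixed(" 45 48 80100")
# -> [0.45, 0.48, 0.8, 1.0]
```

Note how the blank in " 45" is treated as a leading zero, exactly as described in the text, so the field is read as 045 and becomes .45 once the two implied decimal places are applied.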
Optional Input. We turn now to optional information that can be added to the DAta specification section of the input file. These include the entry of mean and standard deviation values, and a selection statement that controls the order in which the variables are read. We turn first to the entry of means and standard deviations.
Means and standard deviations. Generally speaking, the only time that the means are of interest is in the analysis of mean structures, a topic that is addressed in chapter 10 of this volume. (All other applications in this book are based on the analysis of covariance structures.) The input of standard deviations, on the other hand, is extremely useful if the data are only available as a correlation matrix; provided with the standard deviations, the LISREL program can compute the covariance matrix. Thus, in light of current knowledge that the analysis of correlational data can be problematic
in the analysis of covariance structures (Cudeck, 1989; Jöreskog & Sörbom, 1988), it is important that LISREL analyses be based on the covariance matrix.4 The addition of means, standard deviations, or both to the input file involves two lines each, for which there is entered the appropriate keyword (ME, SD, respectively); each is followed by the actual values, which must be separated by blanks. Consistent with the entry of data as discussed earlier, means, standard deviations, or both can be specified either in free format (see footnote 3, this chapter), or preceded by a fixed format statement. Two important caveats related to the entry of means and standard deviations are to ensure that (a) the number of values entered is consistent with the number of variables listed, and (b) the order of these values is consistent with the order of the entered data. Finally, the two (or four) lines describing this information follow immediately after the entry of data. An example that illustrates the inclusion of both mean and standard deviation values is:
ME
76.41 52.90 55.60 49.22
SD
10.10 9.05 4.56 7.80
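The conversion that LISREL performs when standard deviations accompany a correlation matrix is simply cov(i, j) = sd(i) x sd(j) x r(i, j). A numpy sketch, using the correlation values and standard deviations from the examples above (an illustration of the arithmetic only, not LISREL itself):

```python
import numpy as np

corr = np.array([[1.00, 0.70, 0.50, 0.45],
                 [0.70, 1.00, 0.55, 0.48],
                 [0.50, 0.55, 1.00, 0.80],
                 [0.45, 0.48, 0.80, 1.00]])
sd = np.array([10.10, 9.05, 4.56, 7.80])  # SD line from the example

# cov[i, j] = sd[i] * sd[j] * corr[i, j]; the diagonal becomes the variances
cov = np.outer(sd, sd) * corr
```

The diagonal of the result holds the variances (e.g., 10.10 squared for SDQASC), which is why supplying an SD line lets the program analyze a covariance matrix even though only correlations were entered.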
Selection of variables.
The program reads data in the order by which the variables are written in the data file; the labeling of variables on the LA line, therefore, should be consistent with this order. However, for a variety of reasons, you may wish to use only certain variables from those listed on the LA line, or you may wish to have the variables read in an order that differs from the way they are listed on the LA line. For example, in a full structural equation model in which there are both dependent (Ys) and independent (Xs) variables, the reading of the Y-variables must precede that of the X-variables. This is easily accomplished by adding a line that begins with the SElection keyword immediately following the row of SDs. In a separate line, immediately following the SE line, the variables (either by name or number) are then listed in the order by which they are to be read. If certain variables are to be excluded from the analyses, then the SE line
4If the data are entered in raw form, and MA=CM, the program automatically computes the covariance matrix.
should end with a slash (/). For example, based on Fig. 1.3, suppose that we did not wish to include the variable SPPCASC in the analyses. We can easily eliminate it by including the SE line as follows:

SE
1 3 4/

or as,
SE
SDQASC SDQSSC SPPCSSC/
As noted earlier, the SE line is also used to alter the order in which the variables are read by the program for purposes of specific analyses. In the previous example, the variables were read automatically in the order 1 2 3 4, the order in which they appeared in the correlation matrix. As such, an SE line is not required. However, if the variables are to be read in a different order, the SE line is necessary; again, reference can be made either to their labels or their ordered number in the list of variables on the LA line. The following example illustrates this flexibility of variable selection.

SE
1 3 2 4

or as,
SE
SDQASC SDQSSC SPPCASC SPPCSSC
In this example, therefore, the variable SDQSSC is to be read prior to the variable SPPCASC, albeit the latter was entered first in the original data-entry process. • Hint: If the SE line has been used to indicate the exclusion of particular variables but the slash has been overlooked, LISREL will print an error message indicating that the run was problematic. However, this message will, in no way, direct you to the syntactical error of the missing slash symbol. 5 Thus, the fact that the job was not executed correctly should alert you to the fact that something is wrong with the input file. We next focus our attention on the third component of the input file that addresses the issue of model specification. 5Essentially, the program continues to read the LISREL command language on the next line as labels.
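The reordering the SE line performs is equivalent to permuting both the rows and the columns of the input matrix. A numpy sketch of the reordering example above (an illustration only; LISREL does this internally):

```python
import numpy as np

corr = np.array([[1.00, 0.70, 0.50, 0.45],
                 [0.70, 1.00, 0.55, 0.48],
                 [0.50, 0.55, 1.00, 0.80],
                 [0.45, 0.48, 0.80, 1.00]])

# SE 1 3 2 4 expressed as 0-based indices: 0, 2, 1, 3
order = [0, 2, 1, 3]
reordered = corr[np.ix_(order, order)]
# The SDQASC-SDQSSC correlation (.50) now sits at position (2, 1) of the
# reordered matrix, i.e., row 2, column 1 in LISREL's 1-based counting.
```

Permuting rows and columns together keeps the matrix internally consistent; each correlation still connects the same pair of variables, only their positions change.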
Model Specification

The two-letter keyword that alerts the program to specifications related to the model under study is MO. Essentially, the MO line requires the input of four pieces of information:
• The number of observed variables in the model (xs; ys).
• The number of latent variables in the model (ξs; ηs).
• The form of each matrix to be analyzed.
• The estimation mode of each matrix.
We turn now to each of these components.
Observed Variables. Model specifications can include only x-variables, only y-variables, or both. The keywords are:
NX = number of x-variables in the model.
NY = number of y-variables in the model.
Latent Variables. Model specifications can include only ξs, only ηs, or both. LISREL refers to these latent variables as KSI and ETA, respectively. The keywords are:
NK = number of ξs in the model.
NE = number of ηs in the model.
Basic Matrix Forms. To fully comprehend the CFA and SEM applications in this book you will need to be familiar with the structure of five basic matrix forms. These matrices are schematically portrayed as:

Full Matrix:
x x x x
x x x x
x x x x
x x x x

Symmetric Matrix:
x
x x
x x x
x x x x

Diagonal Matrix:
x
  x
    x
      x

Identity Matrix:
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1

Zero Matrix:
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
Where x = parameter to be estimated; 1 = a fixed value of 1.00; 0 = a fixed value of 0.0. Several points are worthy of note here, with respect to the treatment of these matrices by the LISREL program:
1. The "x"s in the full (FU), symmetric (SY), and diagonal (DI) matrices represent values that are stored in computer memory. This means that if a matrix is specified as DI, one cannot refer to off-diagonal elements in the matrix, as these values are not stored in the computer. For example, suppose that a variance-covariance matrix Phi were specified as a diagonal matrix with variance estimates freely estimated (PH=DI,FR). Any specification of a freely estimated covariance (e.g., PH(3,2)) is not possible and would therefore result in the issuance of a syntax error, at which point the program stops.
2. Likewise, if a matrix is stored as an identity (ID) or zero (ZE) matrix, one cannot refer to any element in these matrices.
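The storage rule in point 1 can be mimicked in numpy: a diagonal matrix is effectively held as just its n diagonal values, so off-diagonal elements simply do not exist as stored quantities. This is an analogy for the five forms above, not a description of LISREL's internal memory layout.

```python
import numpy as np

n = 4
full      = np.full((n, n), 1.0)           # all n*n elements stored
symmetric = np.tril(np.full((n, n), 1.0))  # only the lower triangle need be stored
diag_vals = np.array([1.0, 2.0, 3.0, 4.0]) # a DI matrix: only n values exist
identity  = np.eye(n)                      # fixed pattern of 1s and 0s
zero      = np.zeros((n, n))               # fixed pattern of 0s

# Expanding the stored diagonal to 2-D fills the off-diagonals with zeros;
# there is no stored value at, say, position (1, 2) to refer to or free.
diagonal = np.diag(diag_vals)
```

This is why a specification such as PH(3,2) under PH=DI,FR is rejected: the request names a cell that the diagonal storage scheme never allocates.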
Matrix Estimation Mode. In addition to specifying the form of the various matrices involved in a particular model specification, the LISREL user must also assign a mode of "free" (FR) or "fixed" (FI) to each. Specification of the mode of a matrix (as FR or FI) means that all elements in that matrix are assigned the same mode. FRee parameters are those whose values are unknown and therefore will be estimated by the program. FIxed parameters are assigned some value by the investigator a priori and, thus, they are not estimated by the program. Let's examine a couple of examples to illustrate matrix estimation mode.

LX=FU,FI
TD=DI,FR
In the example on the left, the Lambda-X matrix is specified as a full matrix (its matrix form), with all elements fixed to some value assigned by the user (its matrix mode). The default value for fixed parameters is 0.0. This means that if no value is assigned to these fixed parameters, the program will automatically constrain their value to zero. In the second example, the Theta Delta matrix is specified as a diagonal matrix with all elements free to be estimated by the program. Before proceeding further with this description of model specification, let us take a slight digression in order to examine the concept of default values, an operation inherent in all computer programs. Basically, the idea underlying the use of default values is to ensure that the program is provided with information essential to its execution of a job in the event that the user, either
by choice or otherwise, failed to include it. As such, the program automatically implements a fixed value that has been preselected on the basis of most common use. With LISREL, default values are associated with particular matrix specifications. Once a user is fully familiar with the LISREL program, he or she may wish to make use of the matrix default values in model specification, as they can reduce the time involved in building the input file. However, because it is essential that the novice LISREL user become fluent in making the cognitive link between the model, as presented diagrammatically and as specified in LISREL lexicon, I strongly encourage you to always enter the full matrix description, regardless of what the default value might be. In addition to assigning a mode of FIxed or FRee at the matrix level, one must also make such an assignment at the parameter (or element) level; model specification relative to matrix parameters is addressed next. However, before turning to this topic, it would be helpful to review the LISREL default values assigned at both the matrix and parameter levels. Although I recommended that you not concern yourself with the default form of matrices (at least while you are learning the program), it is important that you know the default mode of the parameters within each matrix because this information will determine how you specify the mode of each parameter in the model. These default values are presented in Table 2.1. To review how these default values work, let us examine a couple of examples. First, if the matrix LX were to be entered on the MO line, with no further description regarding either its form or mode, the program would automatically consider it to be a full matrix with all elements fixed to zero (FU,FI), unless otherwise specified later in the input file (discussed below). Likewise, if PH appeared on the MO line as PH=SY, with no indication of matrix mode, the parameters in this matrix would be freely estimated (by default). Let us now move on to examine additional information that is required to complete the model specification process.
Additional Model Specification. One helpful piece of information that can be incorporated into the file at this point is the assignment of labels to the latent variables in the model. Although it is optional, I strongly urge you to use this feature as it can aid substantially in the interpretation of the output file. Consistent with the labeling of observed variables, the procedure for latent variables involves two separate lines for each of the
TABLE 2.1
Default Values for LISREL Matrices

Matrix Name             Greek Notation   LISREL Notation   Matrix Form   Matrix Mode
Lambda-X                Λx               LX                FU            FI
Lambda-Y                Λy               LY                FU            FI
Gamma                   Γ                GA                FU            FR
Beta                    B                BE                ZE            FI
Phi                     Φ                PH                SY            FR
Psi                     Ψ                PS                DI            FRa
Theta Delta             Θδ               TD                DI            FR
Theta Epsilon           Θε               TE                DI            FR
Theta Delta Epsilonb    Θδε              TH                ZE            FI

aWith earlier versions of LISREL, the default value for the Psi matrix was SY,FR.
bThis matrix was introduced in LISREL 8; for a discussion and application, see chapter 12 of this volume.
dependent and independent factors; they follow immediately after the MO line. The following example should clarify this procedure; as such, LE represents labels for the etas (ηs) and LK, labels for the ksis (ξs).

LE
AA ASC
LK
MSC ESC SCS

In most cases, model specification requires further detail beyond the matrix level. This additional refinement lies with the estimation mode of the individual matrix elements (i.e., the parameters in the model). More specifically, regardless of the mode of the matrix as a whole, the individual parameters within a particular matrix may be specified as fixed (FI), free (FR), or constrained equal to some other parameter(s) (EQ). For example, despite
the fact that a matrix may be specified as, say, FI, meaning that all parameters in the model are fixed to some preassigned value, any parameter in that matrix may be individually specified as free. As such, matrix elements are identified by parenthesizing their coordinate points (i.e., their intersecting row and column numbers, as shown in chapter 1), which are separated either by a comma or by a blank. (For reasons of clarity, my preference lies with the comma and, thus, this syntax is used throughout the book.) To illustrate the specification of individual parameters, let us turn to the simple two-factor CFA model in Fig. 1.4. However, for didactic purposes let us consider the two factors to be uncorrelated. We turn now to the MO line related to this model:

MO NX=4 NK=2 LX=FU,FI PH=SY,FR TD=DI,FR
FR LX(2,1) LX(4,2)
FI PH(2,1)

As indicated earlier, the MO line specifies that the model has four X (observed) variables and two latent factors (ξs). The Lambda-X matrix is a full matrix with all elements fixed (to 0.0 by default); the Phi matrix is symmetric, with all elements freely estimated; and the Theta Delta matrix is diagonal, with all elements freely estimated. In the two lines that follow the MO line, we see that, within the LX matrix, there are two elements to be freely estimated, and within the PH matrix there is one element to be fixed. Let us turn first to the LX matrix, and view it from three different perspectives.
(a) Matrix With LISREL Notation

        ξ1            ξ2
  [ LX11 (FI)    LX12 (FI) ]
  [ LX21 (FR)    LX22 (FI) ]
  [ LX31 (FI)    LX32 (FI) ]
  [ LX41 (FI)    LX42 (FR) ]

(b) Matrix With Greek Notation

        ξ1             ξ2
  [ λ11 (fixed)   λ12 (fixed) ]
  [ λ21 (free)    λ22 (fixed) ]
  [ λ31 (fixed)   λ32 (fixed) ]
  [ λ41 (fixed)   λ42 (free)  ]

(c) Matrix With Assigned Values for Fixed Parameters

     ξ1     ξ2
  [ 1.0    0.0 ]
  [ λ21    0.0 ]
  [ 0.0    1.0 ]
  [ 0.0    λ42 ]
Matrices a and b are fairly straightforward and, thus, you likely encountered no difficulty in their interpretation. For purposes of review, however, let us take a brief look at matrix c. The two-factor model depicted in Fig. 1.4 postulates that two observed variables load on each factor in a specific pattern. Within the framework of the Λx matrix, then, λ11 and λ21 are hypothesized to load on Factor 1 (ξ1), and λ32 and λ42 to load on Factor 2 (ξ2). Thus, λ31, λ41, λ12, and λ22 are fixed to zero. However, recall that for purposes of statistical identification, and to establish the scale of the latent variables, one of the two estimable parameters for each factor can be fixed, say, to a value of 1.00. Although typically researchers tend to fix the first of a set of λs to 1.00, this assignment is not always optimal. More appropriately, the indicator variable that is used as a scale reference variable should be the one that measures the underlying factor best (i.e., it is the most reliable and valid; Jöreskog & Sörbom, 1993a). As noted in chapter 1, an alternative approach to establishing the scale of the latent variables is to allow these factors to be standardized by the program. In contrast to the previous version of the program, which could standardize only the independent factors (the ξs), LISREL 8 can perform the task regardless of whether the latent factors are independent (ξ) or dependent (η) variables in the model. This approach to the scaling of the latent factors requires two changes to our model specification illustrated earlier. First, the phi matrix (Φ) would be specified as PH=ST, in lieu of PH=SY,FR. Second, the VA line (see next section) would not exist, as it must when reference variables in the Λ matrix are fixed to 1.00. However, it is important to note that implicit in the ST specification is the assumption that the factors are hypothesized as being correlated with one another.
This alternate specification, then, would be as follows:

MO NX=4 NK=2 LX=FU,FI PH=ST TD=DI,FR
FR LX(1,1) LX(2,1) LX(3,2) LX(4,2)
The ST keyword engenders a special meaning whereby the diagonal elements of the matrix (i.e., the variances) are fixed to 1.00 and the off-diagonals are freely estimated. Furthermore, the ST specification cannot be overridden by fixing individual elements, freeing individual elements, or both within the matrix. Thus, the specifications PH=ST,FI and PH=ST,FR are not permissible. Now let us examine model specification related to the second selected matrix, the variance-covariance phi matrix. In the MO line, this matrix is specified as being symmetric and free. This means that the variance of both Factor 1 (φ11) and Factor 2 (φ22), in addition to their covariance (φ21), will all be freely estimated. However, recall that, for purposes of this illustration, the two factors were deemed to be uncorrelated; thus the need to fix φ21 to zero. Consistent with model specification, therefore, this matrix can be presented as:
  [ φ11       ]
  [ 0     φ22 ]
It is worthwhile to note that model specification related to the Lambda-X and Phi matrices illustrated earlier could just as easily have been in reverse. That is to say, in lieu of the LX matrix being specified as fixed with two of its parameters freely estimated, it could have been specified as free, with six of its parameters fixed (either to zero or to 1.00). The following revision of the previous model specification takes these reversals into account.

MO NX=4 NK=2 LX=FU,FR PH=SY,FI TD=DI,FR
FI LX(1,1) LX(3,1) LX(4,1) LX(1,2) LX(2,2) LX(3,2)
FR PH(1,1) PH(2,2)

A comparison of these statements with the earlier ones will quickly demonstrate that the two model specifications are identical; thus it is easy to see that there is more than one way to specify a model. For the sake of expedience, it is best to specify the matrix in accordance with the intended mode of most of its elements. For example, if the majority of the elements are to be freely estimated, then specify the mode of the matrix as FR. If, on the other hand, most matrix elements are to remain fixed, then specify the
Using LISREL, PRELIS, and SIMPLIS
matrix as FI. By heeding this recommendation, less input will be required in specifying the mode of the individual parameters in each matrix. In both examples illustrated earlier, however, LX(1,1) and LX(3,2) were used as reference variables and, in this case at least, had a fixed value of 1.00. Therefore, the topic to which we turn next addresses this issue of value assignment for fixed parameters.
Value Assignment of Fixed Parameters.
In the case where a fixed parameter is to be assigned a nonzero value, rather than the default value of 0.0, this can be accomplished by means of either the VAlue or STart control lines; the two lines are equivalent and, thus, can be used synonymously. In either case, the keyword is followed by a number, which is taken as the assigned value for the identified list of matrix parameters. The following example serves to illustrate this type of specification:

FI LX(1,1) LX(3,2)
VA 1.0 LX(1,1) LX(3,2)

or as

ST 1.0 LX(1,1) LX(3,2)
This example indicates that elements (1,1) and (3,2) in the LX matrix are to be fixed to a value of 1.0. For convenience, it is advisable to use the VA keyword for the assignment of nonzero values to fixed parameters, and the ST keyword for the assignment of start values to free parameters. An elaboration of the function of start values now follows. Start values refer to the points at which the program begins the iterative process of establishing parameter estimates. That is, in the process of determining the final estimated values, the program necessarily generates many sets of improved parameter estimates in an effort to reach an optimal solution. Users can either allow the LISREL program to supply these values or provide their own. User-provided start values represent a "best guess" of the expected value of a particular parameter estimate. Although start values are not needed for most LISREL jobs, they often facilitate the iterative process when complex models are under study; such models can generate problems of nonconvergence if the start values provided by the program are inadequate. Nonetheless, because of the multiplicity of models that can be analyzed using the LISREL program, it is
Chapter 2
almost impossible to furnish any firm set of rules governing the formulation of start values. Speaking in general terms, however, I can suggest that a few key factor loadings (λs) should be fairly high (about .9 in a standardized metric), variances should always be larger than covariances, and error variances should be close to the variances of their corresponding derived variables. Moreover, if you either know or suspect that some estimates will be negative, it is very helpful to specify this sign, because negative estimates can be problematic to the iterative process (i.e., may lead to nonconvergence). (For a more extensive discussion of start values, see Bollen, 1989a; Hoyle, 1995a.)
• Hint: If the program terminates and yields the message that "serious problems were encountered during minimization" (see the actual error message later in this chapter), it is often the case that the user-provided start values are far from the true estimated values; as such, the program was unable to attain a convergent solution. As a consequence, the researcher will need to revise the start values in order to speed up the analytic process. One effective guideline to use in this revision is to check the estimated values from the previously aborted output. Although these values do not represent final estimates, they are often closer to the target values than are the user-provided start values.
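To make this concrete, user-provided start values for the two-factor example shown earlier might be entered as follows; the particular numbers are illustrative guesses only, chosen in line with the guidelines just given:

```
FR LX(2,1) LX(4,2)
ST .8 LX(2,1) LX(4,2)                   ! key loadings expected to be fairly high
ST .5 TD(1,1) TD(2,2) TD(3,3) TD(4,4)   ! error variances near the indicator variances
```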
Output Specification

Decisions regarding the output file from an executed LISREL job fall into two categories: the method of parameter estimation, and information related to the analyses. Both are specified on the OU control line. We turn now to these two categories of specification.
Method of Parameter Estimation. LISREL 8 offers a choice of seven methods of parameter estimation: instrumental variables (IV), two-stage least squares (TSLS), unweighted least squares (ULS), generalized least squares (GLS), maximum likelihood (ML), weighted least squares (WLS), and diagonally weighted least squares (DWLS). Because the purposes and underlying assumptions of these methods differ, you are strongly advised to select the one most appropriate for your data. (For a detailed review of each of these methods, readers are referred to Bollen, 1989a; Hayduk, 1987; Jöreskog & Sörbom, 1988, 1993b.)
The IV and TSLS methods are fast and are not based on an iterative process; all remaining methods compute estimates iteratively and require start values for all independent free parameters. These start values are produced by IV for the ULS method, and by TSLS for the GLS, ML, WLS, and DWLS methods. Weighted least squares estimation requires an asymptotic covariance (AC) matrix, whereas the DWLS method requires the asymptotic variances (AV) from the AC matrix. As noted earlier, both of these data matrices are produced by PRELIS. The application described in Chapter 5 illustrates use of the AC matrix and WLS estimation. Finally, LISREL recognizes the ML estimation method as the default. Thus, if no method is specified on the OU line, analyses will automatically be based on maximum likelihood estimation. • Hint: If the user wishes to enter his or her own start values on the initial run, but has no idea what values to use, it can be helpful to obtain these from the program by requesting only the initial estimates. This is easily accomplished by entering the keyword IV on the OU line; the values are equivalent to those generated by TSLS. If more than one of these keywords is entered on the OU line, LISREL will recognize only the last one entered.
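For instance, to override the maximum likelihood default and request generalized least squares estimation, the method is named on the OU line; the sketch below uses the ME= keyword form, which should be verified against your program version:

```
OU ME=GLS
```

Requesting ME=WLS would additionally require that the asymptotic covariance matrix produced by PRELIS be supplied along with the input data.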
Information Related to Analyses.
Although LISREL provides a standard output, the user may wish to obtain additional information bearing on the analyses. As such, he or she can select from a number of options that result in this information being printed in the output file; the appropriate keyword is simply added to the OU line. Options of most interest to novice users of LISREL are listed here, along with their keywords:6

MI = Print modification indices.
PC = Print correlations of estimates.
RS = Print fitted variance-covariance matrix (Σ̂), residuals (S − Σ̂), normalized residuals, Q-plot.7
EF = Print total effects.
FS = Print factor scores regressions.8

6 Users of earlier versions of LISREL will need to add the keywords SE and TV if they wish to obtain material related to the standard errors and t values, respectively. These values are automatically included in a reformatted output file for LISREL 8.
7 Each of these will be discussed in detail in chapter 3, where we examine the first application of LISREL to real data.
8 Discussion related to the production of factor scores for CFA models using LISREL is relatively advanced and goes beyond the intended scope of this book. Interested readers are referred to Bollen (1989b) and Jöreskog & Sörbom (1993b).
SS = Print standardized solution (latent variables only).
SC = Print completely standardized solution (latent and observed variables).
NS = No automatic start values.
PT = Print technical output.
AD = n Check admissibility of solution after n iterations (n = 0, 1, 2, ...); default = 20.9
AD = OFF Set admissibility check off.
AL = Print all output.
ND = Number of decimal places to be printed (0-8); default ND = 2.10
TM = Maximum number of CPU-seconds allowed for a job; default TM = 60.

The meaning and interpretation of most of the information just presented are discussed either in the section dealing with error messages, or as they relate to specific applications in the ensuing chapters of the book. • Hint: In the interest of conserving the global forests, I encourage you to be stringent in your printing of the output file by requesting only output that is directly of use to you at any one step in your analyses. For example, it only makes sense to request the standardized solution after you have attained the final best-fitting model.
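As an illustration only, an OU line combining several of these options (modification indices, residual information, the completely standardized solution, and three decimal places) might read:

```
OU MI RS SC ND=3
```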
Input File Examples

To give you a more comprehensive view of how the various parts of a LISREL input file relate to the path diagram of a particular model, let us now apply what we have learned so far to the last three models presented in chapter 1 (Figs. 1.6, 1.7, and 1.8). To derive maximum benefit from this section, you are urged to study each figure as you refer to its respective input statement. Because these examples are intended only to illustrate the basic structure of a LISREL input file, start values have not been included. In addition, the OU control line has been left blank, which results in the printing of the default standard output file. This would include a list of the control lines in the input file, the parameter specifications, the matrix to be analyzed, ML parameter estimates, and overall goodness-of-fit indices. To begin, we turn to the first-order CFA model portrayed in Fig. 1.6. The related LISREL input file is described in Table 2.2. Although this input file is fairly straightforward, several items are perhaps worthy of review. First, as indicated by the RA keyword, and the remainder of its line, we can see that the data are in the form of a raw data matrix which

9 This default value was 10 with LISREL 7.
10 This default value was 3 with LISREL 7.
Table 2.2
LISREL Input for Hypothesized First-order CFA Model

! 1st-Order CFA of 4-Factor Model of Self-concept
! Initial Model
DA NI=12 NO=250 MA=CM
LA
SDQASC1 SDQASC2 SDQASC3 SDQSSC1 SDQSSC2 SDQSSC3 SDQPSC1 SDQPSC2 SDQPSC3 SDQESC1 SDQESC2 SDQESC3
RA=CFASC.DAT FO
(12F1.0)
MO NX=12 NK=4 LX=FU,FI PH=SY,FR TD=DI,FR
LK
ASC SSC PSC ESC
FR LX(2,1) LX(3,1) LX(5,2) LX(6,2) LX(8,3) LX(9,3) LX(11,4)
FR LX(12,4)
VA 1.0 LX(1,1) LX(4,2) LX(7,3) LX(10,4)
OU
resides in a separate file named "CFASC.DAT." The format statement for these data indicates that, for each case, there are 12 variables (12F), each of which occupies one column with no decimal (1.0). Second, for purposes of identification, as well as for setting the scale of the latent factors, the first measurement indicator for each factor has been specified as fixed: LX(1,1), LX(4,2), LX(7,3), and LX(10,4). As indicated on the VA line, these factor-loading parameters have been given a value of 1.0. Third, although the TD matrix represents a variance-covariance matrix of random measurement error, it is specified as a diagonal matrix. This specification derives from the underlying assumption that the error terms are uncorrelated among themselves. Nonetheless, correlated errors do occur, and we address this issue in later chapters. Finally, as noted earlier, although many of the model specifications used throughout this book represent LISREL default specifications and could therefore be absent from the MO line (e.g., the LX, PH, and TD specifications noted earlier), the full description is provided for both clarity and didactic purposes. Let us now turn our attention to the second-order CFA model presented in Fig. 1.7. The LISREL input file for this model is shown in Table 2.3. As a follow-up to the previous CFA example, and in conjunction with the labeled model in Fig. 1.7, this input file should be fairly clear. Nonetheless, one important specification that perhaps needs some elaboration relates to the higher order factor (ξ1). Two critical points are noteworthy. First, recall from chapter 1 that, for purposes of identification and the establishment of a scale metric for the latent variables, one can constrain either
Table 2.3
LISREL Input for Hypothesized Second-order CFA Model

! 2nd-Order CFA of 5-Factor Model of Self-concept
! Initial Model
DA NI=12 NO=250 MA=CM
LA
SDQASC1 SDQASC2 SDQASC3 SDQSSC1 SDQSSC2 SDQSSC3 SDQPSC1 SDQPSC2 SDQPSC3 SDQESC1 SDQESC2 SDQESC3
RA=CFASC.DAT FO
(12F1.0)
MO NY=12 NK=1 NE=4 LY=FU,FI GA=FU,FR PS=DI,FR TE=DI,FR
LE
ASC SSC PSC ESC
LK
GSC
FR LY(2,1) LY(3,1) LY(5,2) LY(6,2) LY(8,3) LY(9,3) LY(11,4)
FR LY(12,4)
VA 1.0 LY(1,1) LY(4,2) LY(7,3) LY(10,4)
VA 1.0 PH(1,1)
OU
one of a set of indicator variables, or the variance of the factor on which they load, to some nonzero value; typically, this value is 1.0. In the present example, you can observe that all second-order factor loading parameters (the γs) are specified as freely estimated. As a consequence, the variance of the higher order factor on which they load (ξ1) must be fixed to 1.0. Thus, you will note a VA line specified for PH(1,1), the variance of the higher order factor. Second, recall also that in second-order models any covariance among the first-order factors is presumed to be explained by the higher order factor(s). Thus, in contrast to our previous CFA example, this file specifies no Phi matrix of factor covariances; only the variance of the higher order factor is linked to this matrix. Now we turn to our last example, the full structural equation model depicted in Fig. 1.8. The input for this model is shown in Table 2.4. Once again, if you study this LISREL input file in combination with its related model (Fig. 1.8), you will likely find everything falls into place fairly well. However, I consider it worthwhile to highlight a few aspects of this file. First, note that the SE line has been invoked for purposes of altering the order in which the observed variables are read. This reordering of the variables addresses the LISREL requirement that the entry of dependent variables precede that of independent variables. As such, Variables 7-11 (the Y-variables in the model) are entered first. Second, note that the Beta matrix (B) has been specified as full and fixed (FU,FI), whereas the Gamma matrix
Table 2.4
LISREL Input for Hypothesized Full Structural Equation Model

! Impact of Social Self-concept on Self-confidence
! Initial Model
DA NI=11 NO=250 MA=CM
LA
SDQSSCF1 SDQSSCF2 SDQSSCF3 SDQSSCS1 SDQSSCS2 SDQSSCS3 SDQSSC1 SDQSSC2 SDQSSC3 SCONS1 SCONS2
RA=SSCCON.DAT FO
(9F1.0,X,2F2.0)
SE
7 8 9 10 11 1 2 3 4 5 6
MO NY=5 NX=6 NE=2 NK=2 LY=FU,FI LX=FU,FI BE=FU,FI GA=FU,FR C
PH=SY,FR PS=DI,FR TE=DI,FR TD=DI,FR
LE
SSC SCONF
LK
SSCF SSCS
FR LY(2,1) LY(3,1) LY(5,2)
FR LX(2,1) LX(3,1) LX(5,2) LX(6,2)
FR BE(2,1)
FI GA(2,1) GA(2,2)
VA 1.0 LY(1,1) LY(4,2) LX(1,1) LX(4,2)
OU
(Γ) has been specified as full and free (FU,FR).11 If you now turn to the matrix equation involving both the B and Γ matrices, you will clearly recognize the need to specify BE(2,1) as free in a matrix of fixed 0.0 values, and GA(2,1) and GA(2,2) as fixed in a matrix that is specified as free. Finally, note that the letter "C" appears on the MO line, indicating that specification continues on the next line. By now you should be relatively familiar with the basic process of composing a LISREL input file. Before proceeding to its companion package PRELIS, however, it may be helpful to say a few words about error messages that may appear in the output file.
The LISREL Output File: Error Messages

Because the remaining chapters of this book are devoted to a thorough discussion of both the input and output files for various applications of LISREL, there is little reason to examine output related to the three hypothetical examples just reviewed. However, a few comments related to error messages, which most certainly will appear at some time in your output file, would seem to be in order.

11 Actually, the GA matrix could just as easily have been specified as FU,FI, in which case GA(1,1) and GA(1,2) would have been specified as free.
Errors are inevitable, regardless of how familiar one is with various computer environments and software packages. Indeed, viewed from a positive perspective, they can often serve as important educational milestones in the learning of a particular program. Typically, error messages produced by standard statistical packages provide some clue as to the location and correction of the error. However, this task is substantially more difficult for structural equation modeling programs, owing to the combined complexities associated with both the data and the specified model. Thus, in using LISREL, you may find yourself faced with a message that in no way directs you to the problem. These messages can often be triggered by simple syntax errors such as omitting a symbol (e.g., the slash on the SE line), omitting a blank or a comma (e.g., LX=FUFI), or misspelling a keyword. Nonetheless, although the program will detect most syntactical errors, it cannot uncover an incorrectly specified number of variables. The first thing to do when confronted with an error message, therefore, is to screen your input file carefully because, in general, most LISREL problems tend to result from typographical or syntactical errors. I strongly urge you to reexamine your input file with a critical eye before running the job; in the long run, it can save you many hours of distress! In general, when the specified model and the data are compatible, LISREL works well, yielding good initial estimates and converging after only a few iterations to an admissible solution. In practice, however, such a scenario is not common, largely because numerous conditions can hinder this match. Basically, these impediments fall into two categories: those that are minor and those that are serious. We turn now to a brief summary of possible reasons for the triggering of these messages.
Minor Errors. These errors are typically the result of syntax mistakes. Although LISREL will detect most errors of this nature, some cannot be determined until after the job has been executed (e.g., the appropriateness of start values). A few of the most common errors in this category are:
• Using keywords that do not conform to the LISREL naming convention.
• Omitting required slashes, equal signs, commas, blanks, or any combination thereof.
• Placing the DA, MO, and OU control lines in the wrong order.
• Making invalid specifications on the MO, FR, FI, and VA control lines (e.g., making reference to LY when NK has been specified).
• Leaving pairs of parentheses unmatched.
Serious Errors.
Although the detection of syntax errors is fairly simple to program, the ferreting out of data errors is more complex. As a consequence, Jöreskog and Sörbom (1989) recommended that all data be read from external files. As such, LISREL can determine if there are too few elements, illegal characters in the data, or both. However, the program cannot detect if there are too many elements, or unreasonable (albeit legitimate) values in the data. Thus, it is always wise to check the correctness of the matrix being analyzed, which will appear on the output file. If, indeed, there are erroneous values in the matrix under study, this can trigger the following error message:

FATAL ERROR: Matrix to be analyzed is not positive definite.

Beyond errors due to data problems, there are others that can arise from model misspecification. For example, the model specified on the MO line of the input file may not be the one intended, or it may simply be an inappropriate model for the data in hand. LISREL objects strongly to bad models, and this objection is manifested in the generation of error messages related to the following problems:
• The initial estimates are poor. This situation can generally be linked to poor start values (i.e., the parameter values at which the iteration process begins). That is, the start values are so far from the actual parameter estimates that the solution does not converge within the allotted number of iterations (default = 20). This can happen, for example, if the start value entered is a positive number, whereas the true estimated value is negative. Indeed, in some cases, the initial estimates may be so bad that iterations cannot even begin because the matrix Σ, which is computed at the initial estimates of the parameters, is not positive definite (i.e., the matrix cannot be inverted, a necessary condition for obtaining the correct estimates when maximum likelihood estimation is used).
This situation will result in the following message:

FATAL ERROR: Unable to start iterations because matrix SIGMA is not positive definite. Provide better starting values.

• A poor model is detected at the admissibility checkpoint. In any single run, LISREL 8 allows the program to run through 20 iterations, at which point it conducts an admissibility check on the viability of the solution. The purpose of this procedure is to ensure that the factor loading matrices (Λx, Λy) are appropriate (e.g., there are
no rows of zeros), and that the variance-covariance matrices (Φ, Ψ, Θδ, Θε) are positive definite. If, after 20 iterations, the solution is found to be inadmissible, the iterations stop and the following error message is printed out:
FATAL ERROR: Admissibility test failed.
Within the output file, this error message will be accompanied by a warning message that identifies which parameter matrix failed the admissibility check. Confronted with this message, you should check both the data and the model very carefully to see if they are as intended. If this is the case, it is possible to set the admissibility check at a higher number of iterations simply by entering the keyword AD=21 (or some other number greater than the default value of 20). Although LISREL allows the user to set AD=OFF, Jöreskog and Sörbom (1989) suggested that this be done only if you have intentionally specified fixed-zero values in the diagonals of the Φ, Ψ, Θδ, or Θε matrix.
• The solution has not converged after the default number of iterations has been completed.12 This situation typically arises from an inadequately or incorrectly specified model. Presented with such evidence of nonconvergence, the program stops iterating and one of the following two messages is printed out:
WARNING: The number of iterations exceeded XX.
FATAL ERROR: Serious problems encountered during minimization. Unable to continue iteration. Check your model and data.

One highly plausible explanation for this finding of nonconvergence is that the model is empirically underidentified, in the sense that the information matrix13 is nearly singular (i.e., it is close to being nonpositive definite). One way to check this possibility is to request the correlation matrix of the parameter estimates, which is accomplished by entering the keyword PC on the OU control line. This correlation matrix is an estimate of the asymptotic covariance matrix of the parameter estimates, albeit scaled to a correlation matrix. Very large correlations are indicative of a fit function that is nearly flat, which makes it impossible to obtain sufficiently good parameter estimates (Jöreskog & Sörbom, 1989). The typical approach to resolving this indeterminacy is to impose more constraints on the model. For example, more elements in the Λx matrix, the Λy matrix, or both may need to be fixed to zero.

12 The default value for the number of iterations is three times the number of free parameters in the model.
13 The probability limit for the matrix of second-order partial derivatives of the fit function with respect to the estimated parameter values (Hayduk, 1987).
WORKING WITH PRELIS 2

The PRELIS program was designed specifically to serve as a preprocessor for LISREL, hence the acronym PRELIS. Nonetheless, because it can be used effectively to manipulate and save data files and to provide an initial descriptive overview of data, it can also function as a stand-alone program or in conjunction with other programs. The rationale underlying the development of PRELIS was to encourage researchers to screen their data well by providing a program capable of addressing the many problems that can arise in the collection of raw data. Thus, in addition to providing descriptive statistics and graphical displays of the data, PRELIS can prepare the correct matrix to be read by LISREL when the data (a) are continuous, ordinal, censored,14 or any combination thereof; (b) are severely nonnormally distributed; or (c) have missing values. The program can perform the imputation of missing values (i.e., the substitution of real values for those that are missing). In addition, PRELIS can be used for a variety of data-management functions that include, for example, the recoding and transformation of variables, selection of cases, computation of new variables, and creation and merger of raw data files. Finally, PRELIS can be used to conduct bootstrap sampling procedures.15 Accompanying the various LISREL applications presented in this book, I illustrate the use of PRELIS in terms of data screening and the preparation of the appropriate matrix to be analyzed by LISREL when the data are not all continuous variables, or are nonnormally distributed. Let us turn now to a review of the components of a PRELIS input file.
14 Variables having a disproportionately high concentration of cases in either the upper or lower end of the distribution. One example of censored data is test scores exhibiting a "floor" effect (a large proportion of cases having no items correct) or a "ceiling" effect (a large proportion of cases having all items correct; Jöreskog & Sörbom, 1996b).
15 A powerful strategy used in evaluating the replicability of findings whereby repeated random samples are drawn from the data, and the model is estimated for each. (For an introduction to bootstrap methods, see Stine, 1989; for applications to multivariate analyses in general, see Thompson, 1995; for applications to covariance structures, see Yung & Bentler, 1996.)
Creation of an Input File

Consistent with LISREL 8, the PRELIS 2 input file is built on two-letter keywords that either initiate or comprise particular control lines. Essentially, the information in these control lines tells the program where to find the data, how to analyze the data, where to save the newly created matrices, and what to print on the output file. Although there are only three required control lines, the following lines typically define the problem and must be structured in this order:

• Title (optional).
• DAta parameters (required).
• LAbels (optional).
• RAw data (required).
• One or more of the lines related to a particular data manipulation problem (optional).
• OUtput (required).

Finally, as with LISREL, a title for an input file can be entered on the first line without the need of a two-letter keyword, and any number of title lines may be used. So, too, the program will read title lines until it encounters a line beginning with the letters "DA," "Da," "dA," or "da"; thus, the title may not begin with a word that starts with these two letters. In creating a PRELIS input file, one can avoid any possible problems by beginning each title line with an exclamation mark (!). Let us turn now to more detailed information related to the three required control lines (DA, RA, OU), as well as to the LA line and the various data-management lines16 used in the specification of a problem.
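Pulling these control lines together, a minimal PRELIS input file might look like the following sketch; the filename, labels, and format are hypothetical, and MA=CM on the OU line requests computation of a covariance matrix:

```
! Data screening of four self-concept scales
DA NI=4 NO=350
LA; ASC SSC PSC ESC
RA FI=SDQ.DAT FO; (4F1.0)
OU MA=CM
```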
Data Parameters

The purpose of the DA line is to specify global characteristics of the raw data. Five different keywords are used in this definition process:

NI = number of input variables. If this information is not provided, the program will terminate automatically.
NO = number of cases. If the sample size is unknown, either omit this keyword or set NO=0.
MI = missing values. To specify several different values as indicators of missing values for all variables globally, MI=a,b,c should be included on the DA line

16 Information related to the title is identical to that described for LISREL.
where a,b,c represent the numerical values to be treated as missing values. However, if these values relate to only a specific set of variables, then the MI keyword should initiate a separate line as

MI=a,b,c varlist

where varlist indicates the variables to be included. Finally, if there is more than one missing value, these must be separated by commas; no comma appears following the last numeral.17

TR = treatment of missing data, where:
TR=LI = listwise deletion. All cases with missing values will be deleted, and computations based only on cases having values on all variables.
TR=PA = pairwise deletion. For each pair of variables, computations will be based on all cases having complete data on both variables.

Now let us take a look at a couple of DA line examples:

DA NI=25 NO=350 MI=8,9 TR=LI

In this example, NI indicates that there are 25 variables and NO, that there are 350 cases. The MI keyword specifies that, for all 25 variables, there are two values that should be regarded as missing (8, 9). Finally, TR=LI indicates that listwise deletion of missing data should be invoked.

DA NI=25 NO=350 TR=PA
MI=8,9 ASC ESC

In this example, the treatment of missing data is by means of pairwise deletion. Unlike the previous example, however, the missing values of 8 and 9 are to be used with only two variables (ASC, ESC). As such, the MI keyword constitutes a separate line.
Labels

Although the format for specifying label names is basically the same as in LISREL 8, PRELIS 2 allows the variable names to appear on the LA line itself, separated from the keyword by a semicolon. In addition, the program allows for the specification of category labels, consistent with the scale-point coding of the variables. For example, suppose the two variables in the previous example (ASC, ESC) were categorical variables having four scale points: 1 = strongly disagree (SD), 2 = disagree (D), 3 = agree (A), and 4 = strongly agree (SA). These category labels can be entered using the keyword CL as follows:

17 In PRELIS 1, only one missing value could be defined.
DA NI=4 NO=350 MI=9 TR=LI
LA; ASC SSC PSC ESC
CL ASC - ESC 1=SD 2=D 3=A 4=SA

In this example, note first that there are four variables, the names of which appear on the LA line, separated from the keyword by a semicolon. Second, the CL keyword indicates that, for each variable, four categories of response values are possible, and their labels are as described earlier.
Raw Data

This portion of the input file can comprise two aspects of raw data files: the filename and format of the original data file (required), and the scale type of the variables (optional). Here we review each of these topics separately.
Defining the Data File.
In contrast to LISREL, PRELIS can read only raw data, which must reside in an external file. If no specification for the NO keyword on the DA line (see earlier discussion) is entered, or if it is entered as NO=0, the program will determine the sample size and print the number of cases on the output. However, if the NO specification is larger than the actual number of cases, the program will terminate the input when the last case has been read, and the correct sample size will be noted on the output; all computations are based on the correct sample size. On the other hand, if the NO specification is less than the number of actual cases, only that number will be read by the program. In specifying the raw data file to be used in the analyses, the line begins with the keyword RA, followed by the name of the file that houses the data, and then FO, indicating that the fixed format follows; this procedure is identical to the one used in specifying the input data matrix to be analyzed in LISREL. In addition, however, PRELIS allows for an alternate setup that mimics the one shown earlier for the LA line. As such, the fixed format statement can appear at the end of the RA line, preceded by a semicolon. The two possible setups are:

RA FI=BDI.DAT FO
(21F1.0)

or as

RA FI=BDI.DAT FO; (21F1.0)
Using LISREL, PRELIS, and SIMPLIS
Defining the Variable Scales.
In general, the scaling of variables can be classified as being either continuous (i.e., response scores represent any value within the range of possible values) or ordinal (i.e., responses represent ordered categories); ordinal variables are either dichotomous (two categories) or polychotomous (more than two categories). Because social science data typically derive from attitudinal questionnaires or interviews that are structured in a Likert-type format, they are representative of an ordinal scale. Nonetheless, it has been common practice in SEM, if the number of response categories of a variable is greater than two, to treat these variables as if they were on a continuous scale. This practice has stemmed largely from the extremely large sample sizes required by analyses that take the ordinality of the variables into account. Although there is some evidence that the more categories there are, the less severe the bias, Joreskog and Sorbom (1993a, 1993c) maintained that, regardless, these ordinal variables are not continuous variables and therefore should not be treated as such. They argued that because these variables have no metric, their means, variances, and covariances are not meaningful. As a consequence, subsequent analyses using LISREL may result in biased estimates. Thus, when data comprise variables that are ordinal, or of mixed scale types (ordinal and continuous), and are to be analyzed using LISREL, such analyses should be based on the polychoric or polyserial correlation matrix, respectively, using weighted least squares (WLS) estimation procedures. This recommendation applies also to the special case where the data are tetrachoric (both variables are dichotomous) or biserial (the ordinal variable is dichotomous). PRELIS allows the researcher to identify the scale type for each variable in the data. This classification is accomplished by means of a separate line that is initiated by the appropriate keyword for each scale type.
If all variables have the same scale type, the keyword can be made equal to ALL; otherwise, the appropriate variable names are provided. Although PRELIS provides for the specification of continuous (CO), ordinal (OR), and censored variables (censored from above = CA; from below = CB; from above and below = CE), only the first two are considered in this book. (For details related to the other classifications, readers are referred to the manual (Joreskog & Sorbom, 1993c); for an elaboration of censored variables, see Aldrich & Nelson, 1984.) In contrast to the first version of the program, PRELIS 2 considers all variables to be of ordinal scale unless classified otherwise. More specifically, it will first determine the number of categories for each variable. For
variables having fewer than 16 distinct values, PRELIS will classify them as being of ordinal scale; all others will be classified as continuous. Let us examine the following example.

DA NI=4 NO=350 MI=9 TR=LI
LA; ASC MSC ESC AA
CL ASC - ESC 1=SD 2=D 3=A 4=SA
RA FI=ACADSC.DAT FO
(3F1.0,X,1F2.1)
OR ASC MSC ESC
CO AA

Here we have four variables and, as indicated by the scale-type keywords, three are ordinal (ASC, MSC, ESC) and one is continuous (AA). According to the CL keyword, each ordinal variable has four categories. The raw data are contained in a file called "ACADSC.DAT," and the format statement shows that the first three variables occupy one column each, followed by a space, with the last variable occupying two columns and having one decimal place.
Data Manipulation Lines

As noted earlier, PRELIS allows for a number of different operations involved in the manipulation of data, each having its own particular keyword and command line. Under this rubric, we now examine how to invoke the recoding of variables, the computation of new variables, the imputation of missing values, and the selection of cases. Although PRELIS can also be used in the merging of files (which some may consider to be a data-manipulation activity), this process results in the creation of a new raw data file and, thus, is reviewed in the section dealing with the OUtput command line. It is important to note that, in contrast to the first version of the program, in which the order of data-manipulation command lines was arbitrary, PRELIS 2 performs all operations according to the order in which they appear in the input file. As such, variable recoding, computation of new variables, and case selection (if requested) are performed immediately. If treatment of missing data (TR) has been indicated on the DA line, this operation is completed after the aforementioned data manipulations have been conducted. However, if the imputation of missing values is requested,
this operation precedes the treatment of missing data. More specifically, if TR is not specified (in which case the default method of listwise deletion is invoked) or TR=LI appears on the DA line, all cases still having missing values after imputation has been performed are deleted from further analyses. On the other hand, if TR=PA has been specified, no cases will be deleted. Finally, important information related to the printing of statistics needs to be noted. If listwise deletion of missing data has been specified, then both univariate and multivariate statistics will be reported. However, if pairwise deletion has been specified, no multivariate statistics will be reported and all univariate statistics will be based on all cases for which data are available on each variable. We turn now to our first data-manipulation procedure: the recoding of variables.
Variable Recode Line.
In the previous example, the ordinal variables had categories ranging from a low of 1 (strongly disagree) to a high of 4 (strongly agree). Suppose that you wished to reverse this order so that a scale point of "1" represented strong agreement and a "4" represented strong disagreement. This is easily accomplished in PRELIS 2 using the RE keyword:

RE ASC - ESC OLD=1,2,3,4 NEW=4,3,2,1
New Variable Computation Line.
Sometimes it is of interest to create new variables from old variables through various computations. Again, this task is easily performed with PRELIS 2 using the keyword NE; each new variable must be defined on a separate line. In addition to the + and - symbols, multiplication is indicated with an *, and exponentiation by a **. Neither division nor the use of parentheses is permitted. Drawing from the previous example, let us suppose that we wished to compute two new variables: (a) a global self-concept variable comprising the three specific self-concept variables, academic (ASC), math (MSC), and English (ESC), and (b) a revised academic achievement (AAD) variable that represents double the original value (AA). The specification for these computed variables would be:

NE GSC = ASC + MSC + ESC
NE AAD = AA*2
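Because division is not permitted, a quotient must instead be expressed as a product. The sketch below is purely illustrative (the variable names are hypothetical), and it assumes, consistent with the order-of-operations discussion above, that a variable created on one NE line may be used on a subsequent NE line; here, an average of three items is formed by multiplying their sum by the reciprocal of the number of items:

```
NE TOT = ASC + MSC + ESC
NE AVG = TOT*0.333
```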
Case Selection Line.
For a variety of reasons, you may wish to have analyses based on only certain cases. For example, you may want to randomly split the data into two for purposes of cross-validation,18 or to split the data by gender for comparative analyses; the resulting data can be saved in another raw data file (to be illustrated later). PRELIS 2 permits selection of cases based on the SC keyword and command line; the keyword is followed by a particular casecondition, where casecondition represents one of the following:

SC case < 300 (the first 299 cases selected)
SC case > 150 (all cases numbered above 150 selected)
SC case = odd (all odd-numbered cases selected)
SC case = even (all even-numbered cases selected)
In addition, the casecondition can be used in combination with conditions on the values of the variables as follows:

SC casecondition
SC varlist conditionlist
As such, this combination of SC lines selects all cases that satisfy both the casecondition and all conditions in the conditionlist for the variables in the varlist. The following example illustrates this situation:

SC case = odd
SC GENDER = 2

Given that the variable GENDER is coded "1" for males and "2" for females, these two command lines signify that the analyses are to be based on all females having odd case numbers. One thing to notice carefully in this example is the space between GENDER, which represents the varlist, and = 2, which represents the conditionlist.
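To see how the execution order noted earlier plays out in practice, the data-manipulation lines reviewed above can be combined in a single input file. The sketch below is illustrative only (the file and variable names are hypothetical): the RE, NE, and SC lines are carried out immediately, in the order listed, after which listwise deletion of missing data (TR=LI) is applied.

```
DA NI=4 NO=350 MI=9 TR=LI
LA; ASC MSC ESC GENDER
RA FI=ACADSC.DAT FO
(4F1.0)
RE ASC - ESC OLD=1,2,3,4 NEW=4,3,2,1
NE GSC = ASC + MSC + ESC
SC GENDER = 2
OU MA=CM
```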
18 A random subsample of any size can also be obtained by means of the bootstrap sampling procedure performed by PRELIS. However, this topic goes beyond the intent of the present book.

Imputation of Missing Values Line.

In our earlier discussion of the DAta parameter section of the input, we noted that PRELIS allows for either listwise or pairwise deletion of missing data. However, in many
situations, where data may not be missing completely at random (see Little & Rubin, 1987), these two choices may not be optimal. Thus, PRELIS offers a third option for handling missing data: the imputation of missing values using the IM keyword. These values are obtained from another case in the data exhibiting a similar response pattern over a set of variables. (For a complete explanation of this method of imputation, see Joreskog & Sorbom, 1993c.) As noted earlier, imputation is performed after all recodings and other data transformations have been performed. If, however, one must carry out these data manipulations on the imputed data, this will require the execution of two PRELIS runs. In the first run, the imputation command is invoked and the resulting raw data matrix saved. These data are then used for a second run in which the recodings, any other transformations, or both are performed. An example of the imputation command follows, where the parenthesized variables represent the set of variables on which the match will be performed:
IM (ASC SSC PSC ESC) (ASC SSC PSC ESC)
Output Specification

One of the most important functions of PRELIS is its capacity to prepare other external data files that can be read by LISREL. In particular, these relate to the creation of: (a) raw data files, as a consequence of either the merging of two or more files or the building of a subset of data from a larger data file; (b) a polychoric correlation matrix based on ordinal variables; (c) a polyserial correlation matrix based on a combination of ordinal and continuous variables; (d) a product-moment Pearson correlation matrix based on continuous variables; or (e) an augmented moment matrix used in the estimation of an asymptotic covariance or correlation matrix. Keywords related to both the type of matrix to be computed, and the name of the data file to be saved, are entered on the OU command line. We turn now to a description of the creation and saving of raw data files, correlation matrix files, and the asymptotic covariance matrix file.
Raw Data Files. As noted earlier, raw data files can be created in one of two ways: (a) the merging of two or more files, and (b) the construction of a subset of variables from a larger data set. Let us first examine the merged data file.
Creating a subset data file. If the original data set is very large, it is often of interest to create a smaller data file as a subset of the original. This is easily accomplished using PRELIS by simply stating the name of the data file with the keyword RA on the OU line; also required is a specification of the width of the format (i.e., number of columns; WI) and the number of decimal places (ND). In the following example, the data file "ACADSC.DAT" was created as a subset of the original data set called "SC.DAT." Moreover, in the newly created data file, all variables are one column wide and there are no decimal places. In addition, note the two alternative formats in the specification of the variable labels.

DA NI=3 NO=350
LA; ASC MSC ESC
RA=SC.DAT FO
(3F1.0)
OU RA=ACADSC.DAT WI=1 ND=0
Creating a merged data file.
There are times when you may wish to merge data into one file. For example, you may have scores on a particular instrument in separate files for males and females. However, for certain analyses, you may wish to have a global picture of the measured scores. In this case, you would want to merge the files into one. PRELIS allows you to perform this task simply by noting the two data filenames on the RA line, followed by the fixed format for each file on two separate lines. This process is illustrated in the following example:

DA NI=3,3 NO=350,259
LA
ASC MSC ESC
RA=ACADSCM.DAT, ACADSCF.DAT FO
(3F1.0)
(3F1.0)
OU RA=ACADSC.DAT WI=1 ND=0
Polychoric (Polyserial) Correlation Matrix Files.
The polychoric correlation matrix is identified in PRELIS by the keyword PM. Because PRELIS automatically computes the correct correlation matrix according to whether the variables are all ordinal (dichotomous, polychotomous) or of mixed scale (ordinal, continuous), the PM matrix can be one of the following types: polychoric, tetrachoric, polyserial, or biserial. The polychoric correlation matrix can be computed and saved in a separate file using PRELIS; the keyword for saving a matrix is SM. An example of this procedure is:

DA NI=4 NO=350 MI=9 TR=LI
LA; ASC MSC ESC AA
CL ASC - ESC 1=SD 2=D 3=A 4=SA
RA FI=ACADSC.DAT FO
(3F1.0,X,1F2.1)
OR ASC MSC ESC
CO AA
OU MA=PM SM=ACADSC.PM

The OU line in this example requests that the polychoric correlation matrix be computed and that the resulting matrix be saved in a file called "ACADSC.PM." This file can then be used in the analysis of the data using LISREL. However, analyses based on this polychoric matrix must be based on WLS estimation which, in turn, assumes an asymptotic covariance matrix. Thus, in addition to saving the polychoric matrix, one must remember to also save the asymptotic covariance matrix, a topic to which we turn next.
Asymptotic Covariance Matrix Files. The asymptotic (large-sample) covariance matrix represents the estimated sample variances and covariances under arbitrary nonnormal distributions (see Browne, 1982, 1984a). This matrix, in turn, is used to compute a weight matrix for WLS estimation in subsequent analyses based on LISREL. Of prime importance to readers of this book are two major uses of the asymptotic covariance matrix with respect to SEM. First, as noted earlier, when the polychoric matrix is to be analyzed in LISREL, the estimation of parameters must be based on the WLS method that, in turn, requires the asymptotic covariance matrix; in other words, the analysis is based on two matrices: the polychoric matrix and the asymptotic covariance matrix. Second, when the data comprise continuous variables, albeit nonnormally distributed (i.e., there is strong evidence of high skewness, kurtosis, or both), estimation procedures should be based on WLS. In this instance, the two required matrices are the asymptotic covariance matrix and the covariance matrix.
The keyword for the asymptotic covariance matrix is AC, and for saving this matrix, SA. Two example input files follow; the first is relevant to the situation where the variables are either ordinal or of mixed scale types, whereas the second addresses the problem of nonnormally distributed data. Let us look now at the first example.

DA NI=4 NO=350
LA; ASC MSC ESC AA
CL ASC - ESC 1=SD 2=D 3=A 4=SA
RA FI=ACADSC.DAT FO
(3F1.0,X,1F2.1)
OR ASC MSC ESC
CO AA
OU MA=PM SM=ACADSC.PM SA=ACADSC.AC

Focusing on the OU line, we see that computation of the polychoric matrix is requested (MA=PM) and that it is to be saved in a file called "ACADSC.PM." In addition, computation of the asymptotic covariance matrix has been requested (SA) and it is to be saved as "ACADSC.AC." At this point you are likely wondering why the keyword AC was not used in this input file. In actual fact, the SA keyword has already requested that the AC matrix be saved; the AC keyword itself only comes into use in structuring the LISREL input file where analyses are to be based on the asymptotic matrix. We turn now to the second example.

DA NI=4 NO=350
LA; MATH ENG SC AA
RA FI=MARKS.DAT FO
(4F2.0)
CO ALL
OU MA=CM SM=MARKS.CM SA=MARKS.AC

In this example, first of all, we note that all variables are continuous, as indicated by the keyword CO, followed by the word "ALL." The OU line indicates that the covariance matrix is to be computed (MA=CM) and then saved (SM) as the file "MARKS.CM." Finally, the asymptotic covariance matrix is also to be saved (SA); the name of this file will be "MARKS.AC." It is perhaps worthy of note that the file name extension (i.e., the dot and three letters following the initial stem of the file name) is purely arbitrary
Using LlSREl, PRELlS, and SIMPLIS
81
and does not have to be consistent with the matrix computed. Thus, the names used in the examples offered earlier reflect my own preference in the naming of such files.
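Although the LISREL side of this two-matrix analysis is not shown in the PRELIS examples above, a sketch of how the saved files might subsequently be read is given below. The MO and LK lines describe a hypothetical one-factor model purely for illustration; the essential elements are the PM and AC lines (assumed here to follow the same FI= convention as the other matrix input lines) together with the WLS request on the OU line.

```
! Hypothetical LISREL input reading the PRELIS-saved matrices
DA NI=4 NO=350 MA=PM
LA; ASC MSC ESC AA
PM FI=ACADSC.PM
AC FI=ACADSC.AC
MO NX=4 NK=1 LX=FR PH=ST
LK
SC
OU ME=WLS
```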
WORKING WITH SIMPLIS

SIMPLIS is a new command language that was born out of the call by many researchers to simplify both the creation of LISREL input files and the manner by which results are reported in the output file; hence the acronym SIMPLIS. In sharp contrast to the complex LISREL lexicon and formulation of control lines, SIMPLIS is extremely easy to use. The major difference between the two command languages is that SIMPLIS commands are formulated using everyday English, and there is no need to comprehend either the various LISREL models and submodels or Greek and matrix notations. As a consequence, anyone knowing how to formulate an hypothesized model as a path diagram will experience no difficulty whatsoever in using the SIMPLIS language immediately. In their attempt to strip the technical jargon from the original LISREL program, Joreskog and Sorbom (1993c) argued that the SIMPLIS language "shifts the focus away from the technical question of How to do it so that researchers can concentrate on the more substantively interesting question, What does it all mean?" (p. i). Despite this avenue to simplified model specification and analytic results, however, Joreskog and Sorbom underscored the need for a thorough understanding of the substantive framework within which a model is grounded, as well as a sufficient knowledge base on which to make informed decisions regarding both appropriate analyses and interpretations of findings. I strongly concur with this important caveat and, indeed, cannot overemphasize the need for users of the program to have a sound understanding of the basic principles associated with SEM (see chapters 1, 3-12), as well as the specification of models that are firmly linked to theory, replicated empirical research, or both. With three exceptions related to advanced applications of SEM,19 any model that can be specified for use in LISREL (including multigroup models),
19 These applications are linear and nonlinear constraints, general covariance structures, and interval restrictions (see Joreskog & Sorbom, 1993b).
can be specified using the SIMPLIS language. Furthermore, estimation of SIMPLIS-specified models can be based on any of the methods offered by LISREL. In addition, the Windows version of LISREL 8 allows for the creation of high-quality path diagrams of models specified in either command language; parameter estimates, t values (statistical significance and nonsignificance being displayed in different colors), modification indices, and expected parameter change statistics can be incorporated into the diagram. (These terms are fully explained and illustrated in chapter 3.) Indeed, one of the most advanced options available using SIMPLIS within the Windows environment is the opportunity to modify a model's specification of free (or fixed) parameters "on screen" by means of a pointing, clicking, and dragging sequence using the mouse. A pull-down menu then allows the user the option to reestimate and display the modified model. These features of SIMPLIS are illustrated in chapter 12. Aside from the differential syntax used in creating SIMPLIS and LISREL input files, there is a major dissimilarity in the structuring of the output files from the two programs. Whereas the output file for LISREL specifications is presented in matrix form (consistent with the input file), the SIMPLIS output file is presented in equation form. (One example of a SIMPLIS output file is presented in chapter 3.) It is important to note, however, that although SIMPLIS input files can request that the output be presented either in SIMPLIS or in LISREL format, LISREL input files can generate only the standard LISREL output. Finally, in building input files, one must use either the SIMPLIS command language or the LISREL language; the two cannot be mixed in the same input file.
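The request for LISREL-format results is made with a single additional line in the SIMPLIS file. The sketch below is illustrative only; the appended keywords (SS and SC for standardized and completely standardized solutions, MI for modification indices) are drawn from the LISREL OU options and are optional:

```
LISREL OUTPUT: SS SC MI
```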
Creation of an Input File

Because applications presented in chapters 3 through 12 illustrate various SIMPLIS input files, and the three example LISREL input files reviewed in the first section of this chapter are restructured here using SIMPLIS language, only a cursory summary of the major components of SIMPLIS files is now presented. Within the framework of CFA and full SEM, SIMPLIS input files can typically be decomposed into six sections: title, observed variables, form of input data, number of cases, unobserved variables, and model structure to be tested. In addition, a number of options are available to the user; these include method of estimation, number of decimals to be printed, number of
iterations to be used, a path diagram, and LISREL output. The specification information accompanies an appropriate header (two-letter keywords are not used in SIMPLIS) either on the same line (separated by a space or colon), or on the following line (following a carriage return); although the colon can be used in either case for the sake of neatness, it is not a compulsory component of the input file. We now examine the key command headers associated with each of these component parts of the SIMPLIS input file.
Title

As was true for both LISREL and PRELIS, titles in SIMPLIS files are optional, albeit strongly recommended. Again, use of the exclamation mark (!) at the beginning of each title line will prevent any problems related to the use of syntax terminology.
Observed Variables

Names of the observed variables in the data are presented in free format immediately following the title. Again, it is important to note that (a) these names must appear in the order in which they appear in the data, and (b) the program recognizes only the first eight letters of any variable name. The listing of these variable names follows one of two headers: LABELS or OBSERVED VARIABLES.
Form of Input Data

Information related to the input data must follow next. Based on what you have already learned from our review of both the LISREL and PRELIS programs, you will not be surprised to see that the headers for this section are one of the following: RAW DATA, COVARIANCE MATRIX, CORRELATION MATRIX, and ASYMPTOTIC COVARIANCE MATRIX. In addition, if means and standard deviations are to be included in the file, the headers are, appropriately, MEANS and STANDARD DEVIATIONS, respectively. If any data are to be embedded in the input file, these values are entered next. On the other hand, if they reside in an external file, the header is followed by the words FROM filename.ext (e.g., "ACADSC.DAT"). One important difference between SIMPLIS and LISREL input files relates to the specification of the fixed format statement. Whereas this specification can be included in the LISREL input file, this is not so for SIMPLIS. Rather, the format statement must appear as the sole entry on the first line of the data file.
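By way of a hedged illustration (the file names here are hypothetical), the data section of a SIMPLIS file drawing on external files might thus read:

```
COVARIANCE MATRIX from FILE ACADSC.CM
MEANS from FILE ACADSC.ME
STANDARD DEVIATIONS from FILE ACADSC.SD
```

If the matrix were embedded instead, its values would simply follow the COVARIANCE MATRIX header within the input file itself.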
Number of Cases This information is provided following the header SAMPLE SIZE. The numeral can be entered on the same line following an equal sign, a colon, or a space; it can also be entered on the next line.
Unobserved Variables The names of these variables are presented next following the use of one of two headers: UNOBSERVED VARIABLES or LATENT VARIABLES.
Model Structure to Be Tested

This section of the input file specifies all relations involving observed and unobserved variables in the model. These include, for example, the regression of observed variables on unobserved factors, the regression of one latent factor on another, correlations between factors, correlations between error terms, and the like. The relations comprising the model structure are presented under one of three possible headers: RELATIONSHIPS, EQUATIONS, or PATHS. Finally, the end of the file can be noted using the optional header, END OF PROBLEM. Because the three examples to be presented next (and the many applications presented throughout the remainder of the book) can be used to demonstrate the various optional input noted earlier, they are not described here. Let us thus move on, once again, to study the path diagrams presented in Figs. 1.6, 1.7, and 1.8, albeit this time to review how they are specified within the context of a SIMPLIS input file. Because much of the material presented in these files will now seem familiar to you, I only highlight certain aspects of them. We turn first to Fig. 1.6, the schematic representation of a first-order CFA model; the input file is presented in Table 2.5. In reviewing this input file, at least five points are worthy of note. First, as was the case with the LISREL input file, the data are in the form of RAW DATA and these two words serve as the command header that passes this information along to the program. Second, note that there is no parenthesized format statement. Third, the easiest way to formulate the equations is to first observe all variables having arrows pointing at them; these variables constitute the dependent variables in the model, as they are influenced by other variables in the model (for further elaboration of this point, see chapter 1). In transposing this information to the input file, simply list all of these
Table 2.5
SIMPLIS Input for Hypothesized First-Order CFA Model

! 1st-Order CFA of 4-Factor Model of Self-concept
! Initial Model
OBSERVED VARIABLES: SDQASC1 SDQASC2 SDQASC3 SDQSSC1 SDQSSC2 SDQSSC3 SDQPSC1 SDQPSC2 SDQPSC3 SDQESC1 SDQESC2 SDQESC3
RAW DATA from FILE CFASC.DAT
SAMPLE SIZE: 250
LATENT VARIABLES: ASC SSC PSC ESC
EQUATIONS:
SDQASC1 = 1*ASC
SDQASC2 = ASC
SDQASC3 = ASC
SDQSSC1 = 1*SSC
SDQSSC2 = SSC
SDQSSC3 = SSC
SDQPSC1 = 1*PSC
SDQPSC2 = PSC
SDQPSC3 = PSC
SDQESC1 = 1*ESC
SDQESC2 = ESC
SDQESC3 = ESC
END OF PROBLEM
dependent variables on the left-hand side and then make them equal to all variables that impact on them. Thus, the statement "SDQASC2 = ASC" indicates that SDQASC2 is caused by ASC. Fourth, as indicated in Fig. 1.6, the first of each set of indicators is fixed to 1.00 for purposes of both statistical identification and scaling of the related latent factors. This constraint, in a SIMPLIS file, is specified by placing a 1, followed by an asterisk (*), prior to the latent factor. Essentially, this means that SDQASC1, for example, is caused by 1 times the latent variable ASC. Finally, you will observe no specification in the input file for the latent factor correlations. This is because in SIMPLIS, correlations among the independent factors are specified by default. Had we wished to specify no correlation among any of the factors, this would have been accomplished simply by setting the particular correlation to zero.

Let us turn our attention now to Fig. 1.7, which depicts a second-order CFA model. The SIMPLIS input file for this model is presented in Table 2.6. As you will note, this file is almost identical to the previous one (Table 2.5), but not quite. Because each of the first-order factors operates as a dependent factor as a consequence of being influenced by GSC, two important specification differences arise. First, to reflect this effect, each of the first-order factors is specified as being caused by GSC; for variation, I have included the command, PATHS, and indicated the impact of GSC on each of the lower order factors by means of arrows (->). Second, for purposes of statistical identification, as noted in chapter 1 and earlier in this chapter, the variance of GSC has been fixed to 1.00.

Table 2.6
SIMPLIS Input for Hypothesized Second-Order CFA Model

! 2nd-Order CFA of 5-Factor Model of Self-concept
! Initial Model
OBSERVED VARIABLES: SDQASC1 SDQASC2 SDQASC3 SDQSSC1 SDQSSC2 SDQSSC3 SDQPSC1 SDQPSC2 SDQPSC3 SDQESC1 SDQESC2 SDQESC3
RAW DATA from FILE CFASC.DAT
SAMPLE SIZE: 250
LATENT VARIABLES: ASC SSC PSC ESC GSC
EQUATIONS:
SDQASC1 = 1*ASC
SDQASC2 = ASC
SDQASC3 = ASC
SDQSSC1 = 1*SSC
SDQSSC2 = SSC
SDQSSC3 = SSC
SDQPSC1 = 1*PSC
SDQPSC2 = PSC
SDQPSC3 = PSC
SDQESC1 = 1*ESC
SDQESC2 = ESC
SDQESC3 = ESC
PATHS:
GSC -> ASC
GSC -> SSC
GSC -> PSC
GSC -> ESC
SET the variance of GSC to 1.00
END OF PROBLEM

Finally, Fig. 1.8 portrays a full structural equation model, and its related SIMPLIS input file is displayed in Table 2.7. There are three important aspects of this file that are worthy of note. First, because we have both independent and dependent variables in this model, and because the entry of dependent variables must be read first by the program, the natural order of the variables (as they appear in the data set) must necessarily be reordered. However, in contrast to the LISREL input file, in which this rearrangement was accomplished by means of the Selection command line, the program's distributor (Scientific Software International) claims that no such command is necessary in a SIMPLIS input file; the program automatically reads the variables in the Equations statements.20 Second, consistent with the reordered entry of observed variables into the program, the latent variables are similarly entered, as specified under the Latent Variables command. Finally, note that, consistent with the model depicted in Fig. 1.8, the equations for the structural part of the model differ. Whereas the latent variable SSC is hypothesized to be influenced by two factors (SSC = SSCF SSCS), the latent variable SCONF is influenced by only one other factor (SCONF = SSC).

Table 2.7
SIMPLIS Input for Hypothesized Full Structural Equation Model

! Impact of Social Self-concept on Self-confidence
! Initial Model
OBSERVED VARIABLES:
SDQSSCF1 SDQSSCF2 SDQSSCF3 SDQSSCS1 SDQSSCS2 SDQSSCS3 SDQSSC1 SDQSSC2 SDQSSC3 SCONS1 SCONS2
RAW DATA from FILE SSCCON.DAT
SAMPLE SIZE: 250
LATENT VARIABLES:
SSC SCONF SSCF SSCS
EQUATIONS:
SDQSSC1 = 1*SSC
SDQSSC2 = SSC
SDQSSC3 = SSC
SCONS1 = 1*SCONF
SCONS2 = SCONF
SDQSSCF1 = 1*SSCF
SDQSSCF2 = SSCF
SDQSSCF3 = SSCF
SDQSSCS1 = 1*SSCS
SDQSSCS2 = SSCS
SDQSSCS3 = SSCS
SSC = SSCF SSCS
SCONF = SSC
END OF PROBLEM

20 Despite this claim, however, I agree with Mueller (1996) that the addition of the Reorder command very often improves the readability of the output, particularly as it relates to the factor loading estimates. A prime example is the output related to multitrait-multimethod models (see chapter 6, this volume).

OVERVIEW OF REMAINING CHAPTERS

By now, you are likely feeling relatively comfortable with the syntax and input file setups using LISREL, PRELIS, and SIMPLIS, and are no doubt anxious to move on to the "real thing." Although I have devoted a fair
amount of time to introducing you to the bricks and mortar of SEM applications using the LISREL family of programs and command languages, I firmly believe, and hope you agree, that it was time well spent. Indeed, it is very important to get a solid grasp of the fundamentals before launching into actual applications. Therefore, let us review what you have learned thus far; in summary, this information includes: (a) concepts underlying SEM procedures, (b) components of the CFA and full SEM models, (c) elements and structure of the LISREL and PRELIS programs, (d) creation of LISREL input files using both the standard LISREL and more recent SIMPLIS command languages, and (e) the primary functions of the PRELIS program, and creation of its input files. Now it is time to see how these bricks and mortar can be combined to build a variety of modeling structures. The remainder of the book, therefore, is devoted to an in-depth examination of basic LISREL applications involving both CFA and full SEM models; one sample SIMPLIS input file accompanies each application. Although all interpretations of output files are based on the LISREL format, a sample SIMPLIS output file is illustrated in chapter 3. Finally, the various features of, and functions performed by, PRELIS are illustrated for all applications based on single-group analyses. In selecting the applications presented in this book, I have endeavored to address the diverse needs of my readers by including a potpourri of LISREL and PRELIS setups. As such, some applications will have the data matrix embedded within the input file, whereas others will draw from an external file; some will include data in the form of a correlation matrix, whereas others will involve the input of raw data; some will include start values, and others will not.
Taken together, the applications presented in the next 10 chapters should provide you with a comprehensive understanding of the various uses that can be made of both the LISREL and PRELIS programs, as well as the ease with which the alternate SIMPLIS command language can be used. Let us move on, then, to our first application.
PART TWO

Single-Group Analyses
Confirmatory Factor Analytic Models

CHAPTER 3: APPLICATION 1
Testing the Factorial Validity of a Theoretical Construct (First-Order CFA Model)

CHAPTER 4: APPLICATION 2
Testing the Factorial Validity of Scores From a Measuring Instrument (First-Order CFA Model)

CHAPTER 5: APPLICATION 3
Testing the Factorial Validity of Scores From a Measuring Instrument (Second-Order CFA Model)

CHAPTER 6: APPLICATION 4
Testing for Construct Validity: The Multitrait-Multimethod (MTMM) Model

The Full Latent Variable Model

CHAPTER 7: APPLICATION 5
Testing the Validity of a Causal Structure
CHAPTER 3
APPLICATION 1: Testing the Factorial Validity of a Theoretical Construct (First-Order CFA Model)

Our first application examines a first-order CFA model designed to test the multidimensionality of a theoretical construct. Specifically, this application tests the hypothesis that self-concept (SC) is a four-factor structure. Browne and Cudeck (1993) suggested that an RMSEA point estimate of less than .05 indicates good fit and that the p-value for the test of close fit should be > .50. Turning to Table 3.5, we see that the RMSEA value for our hypothesized model is .048, with the 90% confidence interval ranging from .034 to .062 and the p-value for the test of closeness of fit equal to .56. Interpretation of the confidence interval indicates that, over all possible randomly sampled RMSEA values, 90% of them will fall within the bounds of .034 and .062, which represents a good degree of precision. Given that (a) the RMSEA point estimate is < .05 (.048); (b) the upper bound of the 90% interval is .06, which is less than the value suggested by Browne and Cudeck (1993); and (c) the probability value associated with this test of close fit is > .50 (p = .56), we can conclude that the initially hypothesized model fits the data well.⁹ Before leaving this discussion of the RMSEA, it is important to note that confidence intervals can be influenced seriously by sample size, as well as by model complexity (MacCallum et al., 1996). For example, if sample size is small and the number of estimated parameters is large, the confidence interval will be wide. Given a complex model (i.e., a large number of estimated parameters), a very large sample size would be required in order to obtain a reasonably narrow confidence interval. On the other hand, if the number of parameters is small, then the probability of obtaining a narrow confidence interval is high, even for samples of rather moderate size. Turning to the next cluster of statistics, we see values related to the Expected Cross-Validation Index (ECVI).
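The RMSEA point estimate just discussed can be reproduced directly from the model χ², its degrees of freedom, and the sample size; a minimal sketch in Python using the values reported in Table 3.5 (the confidence interval, which LISREL derives from the noncentral χ² distribution, is not reproduced here):

```python
import math

def rmsea(chi2, df, n):
    """Point estimate of the Root Mean Square Error of Approximation.

    Any excess of chi-square over its degrees of freedom is treated as
    noncentrality (model misspecification), scaled by df and (N - 1).
    """
    noncentrality = max(chi2 - df, 0.0)
    return math.sqrt(noncentrality / (df * (n - 1)))

# Hypothesized four-factor model: chi-square(98) = 158.51, N = 265
print(round(rmsea(158.51, 98, 265), 3))  # -> 0.048, as reported in Table 3.5
```

Note that when χ² falls below its degrees of freedom, the estimate is set to zero rather than allowed to go negative.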
⁹One possible limitation of the RMSEA, as noted by Mulaik (see Byrne, 1994b), is that it ignores the complexity of the model.

The ECVI was proposed as a means of assessing, in a single sample, the likelihood that the model cross-validates across similar-sized samples from the same population (Browne & Cudeck, 1989). Specifically, it measures the discrepancy between the fitted covariance matrix in the analyzed sample, and the expected covariance matrix that would be obtained in another sample of equivalent size. Application of the ECVI assumes a comparison of models whereby an ECVI index is computed for each model and then all ECVI values placed in rank order; the model having the smallest ECVI value exhibits
the greatest potential for replication. Because ECVI coefficients can take on any value, there is no determined appropriate range of values. In assessing our hypothesized four-factor model, we compare its ECVI value of .89 with that of both the saturated model (ECVI = 1.03) and the independence model (ECVI = 6.55). To conceptualize clearly the comparison of these three models, think of them as representing points on a continuum, with the independence model at one extreme, the saturated model at the other extreme, and the hypothesized model somewhere in between. The independence model is one of complete independence of all variables in the model (i.e., in which all correlations among variables are zero) and is the most restricted. The saturated model, on the other hand, is one in which the number of estimated parameters equals the number of data points (i.e., variances and covariances of the observed variables, as in the case of the just-identified model), and is the least restricted. Given the lower ECVI value for the hypothesized model, compared with both the independence and saturated models, we conclude that it represents the best fit to the data. Beyond this comparison, Browne and Cudeck (1993) have shown that it is now possible to take the precision of the estimated ECVI value into account through the formulation of confidence intervals. Turning to Table 3.5 again, we see that this interval ranges from .77 to 1.03. Taken together, these results suggest that the hypothesized model is well fitting and represents a reasonable approximation to the population.

Moving down to the next set of indices (see Table 3.5), we see the χ² statistic reported for the test of the Bentler and Bonett (1980) independence model (described earlier). The independence model (also known as the null model) is used in the computation of the NFI, NNFI (Bentler & Bonett, 1980), and CFI (Bentler, 1990) indices of fit to be described later.
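The three ECVI values compared above follow directly from each model's χ² and its number of freely estimated parameters, q; a sketch assuming the standard formula ECVI = (χ² + 2q)/(N − 1):

```python
def ecvi(chi2, q, n):
    """Expected Cross-Validation Index: (chi-square + 2q) / (N - 1),
    where q is the number of freely estimated parameters."""
    return (chi2 + 2 * q) / (n - 1)

N = 265                  # sample size
p_star = 16 * 17 // 2    # 136 distinct variances/covariances for 16 variables

# Hypothesized model: chi-square = 158.51 with 98 df, hence q = 136 - 98 = 38
# Saturated model: chi-square = 0, q = 136
# Independence model: chi-square = 1696.73 with 120 df, hence q = 16
print(round(ecvi(158.51, p_star - 98, N), 2))    # -> 0.89
print(round(ecvi(0.0, p_star, N), 2))            # -> 1.03
print(round(ecvi(1696.73, p_star - 120, N), 2))  # -> 6.55
```

All three values reported in Table 3.5 are recovered, with the hypothesized model yielding the smallest (best) ECVI.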
In large samples, this model serves as a good baseline against which to compare alternative models for purposes of evaluating the gain in improved fit. Given a sound hypothesized model, one would naturally expect the χ² value for the null model to be extremely high, thereby indicating excessive malfit; such is the case with the present example (χ²(120) = 1696.73). Thus, relative to the null model, our hypothesized four-factor model represents a substantially better fit to the data (χ²(98) = 158.51). On the other hand, had the fit of the four-factor model been close to that of the null model, this would have raised serious questions about the soundness of the hypothesized model. Included in the same cluster as the independence χ² are goodness-of-fit statistics resulting from the evaluation of the independence (or null), the
hypothesized, and the saturated models, based on Akaike's (1987) Information Criterion (AIC) and Bozdogan's (1987) consistent version of the AIC (CAIC). These criteria address the issue of parsimony in the assessment of model fit; as such, statistical goodness-of-fit, as well as the number of estimated parameters, are taken into account. Bozdogan, however, noted that the AIC carried a penalty only as it related to degrees of freedom (thereby reflecting the number of estimated parameters in the model), and not to sample size. Presented with factor analytic findings that revealed the AIC to yield asymptotically inconsistent estimates, he proposed the CAIC, which takes sample size into account (Bandalos, 1993). As with the ECVI, the AIC and CAIC are used in the comparison of two or more models, with smaller values representing a better fit of the hypothesized model (Hu & Bentler, 1995). These three indices also share the same conceptual framework; as such, the AIC and CAIC reflect the extent to which parameter estimates from the original sample will cross-validate in future samples (Bandalos, 1993). Turning to the output once again, we see that both the AIC and CAIC statistics for the hypothesized model are substantially smaller than they are for either the independence or the saturated models.

Let us move on now to the next group of statistics. The Root Mean Square Residual (RMR) represents the average residual value derived from the fitting of the variance-covariance matrix for the hypothesized model [Σ(θ)] to the variance-covariance matrix of the sample data (S). However, because these residuals are relative to the sizes of the observed variances and covariances, they are difficult to interpret. Thus, they are best interpreted in the metric of the correlation matrix (Hu & Bentler, 1995; Joreskog & Sorbom, 1989).
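The AIC and CAIC comparison just described can be sketched with the commonly used forms AIC = χ² + 2q and CAIC = χ² + (ln N + 1)q, where q is again the number of estimated parameters (these forms are an assumption here; the exact penalty constants vary slightly across program versions):

```python
import math

def aic(chi2, q):
    """Akaike Information Criterion: penalizes chi-square by 2 per parameter."""
    return chi2 + 2 * q

def caic(chi2, q, n):
    """Bozdogan's consistent AIC: the per-parameter penalty grows with ln(N)."""
    return chi2 + (math.log(n) + 1) * q

N = 265
# (chi-square, q) for the three models evaluated in the output
models = {"hypothesized": (158.51, 38),
          "saturated": (0.0, 136),
          "independence": (1696.73, 16)}
for name, (chi2, q) in models.items():
    print(name, round(aic(chi2, q), 2), round(caic(chi2, q, N), 2))

# The hypothesized model yields the smallest AIC and CAIC of the three,
# consistent with the comparison reported in the text.
```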
The standardized RMR therefore represents the average value across all standardized residuals, and ranges from zero to 1.00; in a well-fitting model this value will be small (say, .05 or less). The value of .048 shown in Table 3.5 represents the average discrepancy between the sample observed and hypothesized correlation matrices and can be interpreted as meaning that the model explains the correlations to within an average error of .048 (see Hu & Bentler, 1995).

The Goodness-of-Fit Index (GFI) is a measure of the relative amount of variance and covariance in S that is jointly explained by Σ(θ). The AGFI differs from the GFI only in the fact that it adjusts for the number of degrees of freedom in the specified model. As such, it also addresses the issue of
parsimony by incorporating a penalty for the inclusion of additional parameters. The GFI and AGFI can be classified as absolute indices of fit because they basically compare the hypothesized model with no model at all (see Hu & Bentler, 1995). Although both indices range from zero to 1.00, with values close to 1.00 being indicative of good fit, Joreskog and Sorbom (1993c) noted that, theoretically, it is possible for them to be negative. This, of course, should not occur, as it would reflect the fact that the model fits worse than no model at all. Based on the GFI and AGFI values reported in Table 3.5 (.93 and .91, respectively), we can once again conclude that our hypothesized model fits the sample data fairly well.

The last index of fit in this group, the Parsimony Goodness-of-Fit Index (PGFI), was introduced by James, Mulaik, and Brett (1982) to address the issue of parsimony in SEM. As the first of a series of "parsimony-based indices of fit" (see Williams & Holahan, 1994), the PGFI takes into account the complexity (i.e., number of estimated parameters) of the hypothesized model in the assessment of overall model fit. As such, "two logically interdependent pieces of information," the goodness-of-fit of the model (as measured by the GFI) and the parsimony of the model, are represented by a single index (the PGFI), thereby providing a more realistic evaluation of the hypothesized model (Mulaik et al., 1989, p. 439). Typically, parsimony-based indices have lower values than the threshold level generally perceived as "acceptable" for other normed indices of fit. Mulaik et al. suggested that nonsignificant χ² statistics and goodness-of-fit indices in the range of .90, accompanied by parsimonious-fit indices in the range of .50, are not unexpected. Thus, our finding of a PGFI value of .67 would seem to be consistent with our previous fit statistics.
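The parsimony adjustment underlying the PGFI (and, analogously, the df adjustment in the AGFI) can be sketched as follows, assuming the standard formulas PGFI = (df/p*) × GFI and AGFI = 1 − (p*/df)(1 − GFI), where p* = p(p + 1)/2 is the number of distinct elements in S:

```python
def pgfi(gfi, df, p):
    """Parsimony GFI: GFI weighted by the ratio of model df to p(p+1)/2,
    the df of a model that estimates no parameters at all."""
    p_star = p * (p + 1) // 2
    return (df / p_star) * gfi

def agfi(gfi, df, p):
    """Adjusted GFI: GFI penalized for the number of estimated parameters."""
    p_star = p * (p + 1) // 2
    return 1 - (p_star / df) * (1 - gfi)

# GFI = .93 with 98 df and 16 observed variables
print(round(pgfi(0.93, 98, 16), 2))  # -> 0.67, matching Table 3.5
print(round(agfi(0.93, 98, 16), 2))  # close to the reported .91 (the small gap
                                     # reflects rounding of the GFI to .93)
```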
We turn now to the next set of goodness-of-fit statistics which, with the exception of the PNFI, can be classified as incremental or comparative indices of fit (Hu & Bentler, 1995; Marsh et al., 1988). As with the GFI and AGFI, incremental indices of fit are based on a comparison of the hypothesized model against some standard. However, whereas this standard represents a baseline model (typically the independence or null model noted earlier)¹⁰ for the incremental indices, it represents no model at all for the GFI and AGFI. We now review the incremental indices first, followed by the PNFI.
¹⁰For alternate approaches to formulating baseline models, see Cudeck and Browne (1983) and Sobel and Bohrnstedt (1985).
For the better part of a decade, Bentler and Bonett's (1980) Normed Fit Index (NFI) has been the practical criterion of choice, as evidenced in large part by the current "classic" status of its original paper (see Bentler, 1992; Bentler & Bonett, 1987). However, addressing evidence that the NFI has shown a tendency to underestimate fit in small samples, Bentler (1990) revised the NFI to take sample size into account and proposed the Comparative Fit Index (CFI). Values for both the NFI and CFI range from zero to 1.00 and are derived from the comparison of an hypothesized model with the independence model, as described earlier. As such, each provides a measure of complete covariation in the data, with a value > .90 indicating an acceptable fit to the data (Bentler, 1992). The NNFI, like the AGFI, takes the complexity of the model into account in the comparison of the hypothesized model with the independence model. However, because the NNFI is not normed, its value can extend beyond the range of zero to 1.00 and, thus, is difficult to interpret. Although these three practical indices of fit are reported in the LISREL output, Bentler (1990) has suggested that the CFI should be the index of choice. As shown in Table 3.5, both the NFI (.91) and CFI (.96) were consistent in suggesting that the hypothesized model represented an adequate fit to the data. The Incremental Index of Fit (IFI) was developed by Bollen (1989a) to address the issues of parsimony and sample size that were known to be associated with the NFI. As such, its computation is basically the same as that of the NFI, except that degrees of freedom are taken into account. Thus, it is not surprising that our finding of IFI = .96 is consistent with that of the CFI in reflecting a well-fitting model.
Finally, the Relative Noncentrality Index (RNI; McDonald & Marsh, 1990) is algebraically equivalent to the CFI in most SEM applications (Goffin, 1993); as with the CFI, coefficient values range from zero to 1.00, with higher values indicating superior fit. As was the case for the PGFI, the Parsimony Normed Fit Index (PNFI) addresses the issue of parsimony; as such, it takes the complexity of the model into account in its assessment of goodness-of-fit. In contrast to the PGFI, however, the PNFI is tied to the NFI in the sense that the NFI (rather than the GFI) is multiplied by the parsimony ratio (see James et al., 1982; Mulaik et al., 1989). Again, a PNFI of .74 (see Table 3.5) falls in the range of expected values.¹¹

¹¹More recently, the parsimony index has been used in conjunction with the CFI to yield the PCFI (see, e.g., Byrne, 1994b; Carlson & Mulaik, 1993; Williams & Holahan, 1994).
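Each of the comparative indices above contrasts the model χ² with that of the independence (null) model; a sketch of the standard formulas, using the values reported earlier for the hypothesized model (χ²(98) = 158.51) and the null model (χ²(120) = 1696.73):

```python
def nfi(chi2, chi2_null):
    """Normed Fit Index: proportionate reduction in chi-square
    relative to the null model."""
    return (chi2_null - chi2) / chi2_null

def cfi(chi2, df, chi2_null, df_null):
    """Comparative Fit Index: like the NFI but based on noncentrality
    (chi-square minus df), which corrects the NFI's small-sample bias."""
    d = max(chi2 - df, 0.0)
    d_null = max(chi2_null - df_null, d)
    return 1 - d / d_null

def nnfi(chi2, df, chi2_null, df_null):
    """Non-Normed Fit Index (Tucker-Lewis): compares chi-square/df ratios,
    so it rewards parsimony but is not bounded by 0 and 1."""
    r, r_null = chi2 / df, chi2_null / df_null
    return (r_null - r) / (r_null - 1)

def ifi(chi2, df, chi2_null):
    """Bollen's Incremental Fit Index: NFI numerator over (null chi2 - model df)."""
    return (chi2_null - chi2) / (chi2_null - df)

def pnfi(chi2, df, chi2_null, df_null):
    """Parsimony NFI: NFI multiplied by the parsimony ratio df/df_null."""
    return (df / df_null) * nfi(chi2, chi2_null)

c, d, c0, d0 = 158.51, 98, 1696.73, 120
print(round(nfi(c, c0), 2))           # -> 0.91
print(round(cfi(c, d, c0, d0), 2))    # -> 0.96
print(round(nnfi(c, d, c0, d0), 2))   # -> 0.95
print(round(ifi(c, d, c0), 2))        # -> 0.96
print(round(pnfi(c, d, c0, d0), 2))   # -> 0.74
```

All of the values reported in Table 3.5 for these indices are recovered.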
The last goodness-of-fit statistic that appears on the LISREL 8 output is Hoelter's (1983) Critical N (CN). It differs substantially from those previously discussed in that it focuses directly on the adequacy of sample size, rather than on model fit. Development of the CN arose from an attempt to find a fit index that is independent of sample size. Specifically, the purpose of the CN is to estimate a sample size that would be sufficient to yield an adequate model fit for a χ² test (Hu & Bentler, 1995). Hoelter (1983) proposed that a CN value in excess of 200 is indicative of a model that adequately represents the sample data. As shown in Table 3.5, the CN value for our hypothesized SC model was 223.31. Interpretation of this finding, then, leads us to conclude that the size of our sample (N = 265) was sufficiently large as to allow for an adequate fit to the model, presuming that it was correctly specified. Having worked your way through this smorgasbord of goodness-of-fit measures, you are likely feeling totally overwhelmed and wondering what you do with all this information! Although you certainly do not need to report the entire set of fit indices, such an array can give you a good sense of how well your model fits the sample data. But how does one choose which indices are appropriate in the assessment of model fit? Unfortunately, this choice is not a simple one, largely because particular indices have been shown to operate somewhat differently given the sample size, estimation procedure, model complexity, violation of the underlying assumptions of multivariate normality and variable independence, or any combination thereof. Thus, Hu and Bentler (1995) cautioned that in choosing which goodness-of-fit indices to use in the assessment of model fit, careful consideration of these critical factors is essential.
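The CN inverts the usual logic: it asks how large the sample could be before the obtained discrepancy became statistically significant. A sketch using CN = χ²crit/(χ²/(N − 1)) + 1, with the critical value from the Wilson-Hilferty approximation; the .01 significance level is assumed here because it reproduces the reported value:

```python
def chi2_crit(df, z):
    """Upper critical value of the chi-square distribution via the
    Wilson-Hilferty approximation, given the standard-normal quantile z."""
    a = 2.0 / (9.0 * df)
    return df * (1 - a + z * a ** 0.5) ** 3

def hoelter_cn(chi2, df, n, z=2.3263):
    """Hoelter's Critical N: the sample size at which the obtained fit
    would just reach significance (z = 2.3263 is the .01-level quantile)."""
    f_min = chi2 / (n - 1)  # minimized value of the discrepancy function
    return chi2_crit(df, z) / f_min + 1

print(round(hoelter_cn(158.51, 98, 265), 2))  # close to the 223.31 in Table 3.5
```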
For further elaboration on the aforementioned goodness-of-fit statistics with respect to their formulae and functions, or the extent to which they are affected by sample size, estimation procedures, violations of assumptions, or some combination of these, readers are referred to Bandalos (1993); Bollen (1989b); Browne and Cudeck (1993); Curran, West, and Finch (1996); Gerbing and Anderson (1993); Hu and Bentler (1995); Hu et al. (1992); Joreskog and Sorbom (1993c); La Du and Tanaka (1989); Marsh et al. (1988); Mueller (1996); Mulaik et al. (1989); Sugawara and MacCallum (1993); West et al. (1995); Wheaton (1987); and Williams and Holahan (1994); for an annotated bibliography, see Austin and Calderon (1996). In finalizing this section on model assessment, I wish to emphasize that global fit indices alone cannot possibly envelop all that needs to be known
about a model in order to judge the adequacy of its fit to the sample data. As Sobel and Bohrnstedt (1985) so cogently stated over a decade ago, "Scientific progress could be impeded if fit coefficients (even appropriate ones) are used as the primary criterion for judging the adequacy of a model" (p. 158). They further posited that, despite the problematic nature of the χ² statistic, exclusive reliance on goodness-of-fit indices is unacceptable. Indeed, fit indices provide no guarantee whatsoever that a model is useful. In fact, it is entirely possible for a model to fit well and yet still be incorrectly specified (Wheaton, 1987; for an excellent review of ways by which such a seemingly dichotomous event can happen, readers are referred to Bentler & Chou, 1987). Fit indices yield information bearing only on the model's lack of fit. More importantly, they can in no way reflect the extent to which the model is plausible; this judgement rests squarely on the shoulders of the researcher. Thus, assessment of model adequacy must be based on multiple criteria that take into account theoretical, statistical, and practical considerations.
Model Misspecification.
Having established the extent to which the hypothesized model fits the sample data, as reflected by the goodness-of-fit coefficients, our next task is to identify any areas of misfit in the model (Joreskog, 1993). In this regard, LISREL yields two types of information bearing on model misspecification: the residuals, and the modification indices. We turn first to information related to the residuals; the values for our hypothesized model are shown in Table 3.6.
Residuals. Recall that the essence of SEM is to determine the fit between the restricted covariance matrix [Σ(θ)], implied by the hypothesized model, and the sample covariance matrix (S); any discrepancy between the two is captured by the residual covariance matrix. Each element in this residual matrix, therefore, represents the discrepancy between the covariances in Σ(θ) and those in S [i.e., Σ(θ) − S]; that is to say, there is one residual for each pair of observed variables (Joreskog, 1993). In the case of our hypothesized model, for example, the residual matrix would contain [(16 × 17)/2] = 136 elements. This matrix is not automatically provided in the output file, but can be requested by specifying RS on the OU line (see chapter 4). Standard information provided in the LISREL output file are summary statistics related to the residuals, reported in the form of a stem-leaf plot. Let us now turn to this information as provided in Table 3.6.
Table 3.6
LISREL Output for Hypothesized Four-factor CFA Model: Residual Statistics

SUMMARY STATISTICS FOR FITTED RESIDUALS
  SMALLEST FITTED RESIDUAL = -0.55
  MEDIAN FITTED RESIDUAL   =  0.00
  LARGEST FITTED RESIDUAL  =  0.23
  [stem-leaf plot of the 136 fitted residuals not reproduced]

SUMMARY STATISTICS FOR STANDARDIZED RESIDUALS
  SMALLEST STANDARDIZED RESIDUAL = -4.21
  MEDIAN STANDARDIZED RESIDUAL   =  0.00
  LARGEST STANDARDIZED RESIDUAL  =  4.12
  [stem-leaf plot of the standardized residuals not reproduced]

LARGEST NEGATIVE STANDARDIZED RESIDUALS
  RESIDUAL FOR SDQ2N40 AND SDQ2N04   -3.04
  RESIDUAL FOR SDQ2N34 AND SDQ2N28   -2.95
  RESIDUAL FOR SDQ2N07 AND SDQ2N40   -2.88
  RESIDUAL FOR SDQ2N07 AND SDQ2N34   -3.25
  RESIDUAL FOR SDQ2N31 AND SDQ2N19   -4.21

LARGEST POSITIVE STANDARDIZED RESIDUALS
  RESIDUAL FOR SDQ2N25 AND SDQ2N01    4.12
  RESIDUAL FOR SDQ2N40 AND SDQ2N37    2.89
  RESIDUAL FOR SDQ2N31 AND SDQ2N07    3.26
  RESIDUAL FOR SDQ2N43 AND SDQ2N19    2.87
As you will note, there are two such stem-leaf displays: one applies to the fitted (or unstandardized) residuals, the other to the standardized residuals. Summary statistics for the fitted residuals indicate that the smallest fitted residual is -.55 and the largest is .23, with the median residual equal to 0.00. A more comprehensive breakdown of this information is provided in the stem-leaf plot which follows. Although stem-leaf plots are similar to frequency distributions, they have the advantage of being able to convey summary information related to individual, rather than group, values. In reading a stem-leaf plot, the values to the left of the dotted line constitute the stem, whereas those to the right of the dotted line constitute the leaves. In our case, each leaf represents one element in the residual variance-covariance matrix and, thus, there are 136 leaves in the plot. The number associated with each leaf represents the second digit of the residual value. As such, at the top of the fitted residual stem-leaf plot we find the value -.55, and at the bottom, the values .20, .21, and .23, the latter being the largest positive residual. In addition, the stem-leaf plot provides a visual display of the distribution of the residuals. Joreskog (1993) has noted that a well-fitted model will be characterized by standardized residuals that are symmetrically clustered around the zero point, with most being in the middle of the distribution and only a few in the tails. He further cautioned that an excess of residuals, on either the positive or negative side of the plot, is indicative of covariances being systematically underestimated or overestimated by the model, respectively. A review of the stem-leaf plot of standardized residuals shown in Table 3.6 suggests an hypothesized model that is reasonably well-fitting. Because the fitted residuals are dependent on the unit of measurement of the observed variables, they can be difficult to interpret.
Thus, LISREL 8 also provides information related to their standardized values; this summary is presented in the second half of Table 3.6. Standardized residuals are fitted residuals divided by their asymptotic (large sample) standard errors (Joreskog & Sorbom, 1988). As such, they are analogous to z-scores and are therefore the easier of the two sets of residual values to interpret. In essence, they represent estimates of the number of standard deviations the observed residuals are from the zero residuals that would exist if model fit were perfect [i.e., Σ(θ) − S = 0.0]. Values > 2.58 are considered to be large. In reading the stem-leaf display related to the standardized residuals, the stem represents the whole number in a standard deviation value. For example, going to the top of the plot, we interpret this value as -4.2 standard
deviations; the value at the bottom of the plot represents 4.1 standard deviations. As you will note, these are the values reported for the smallest and largest standardized residuals, respectively. In examining the number of residuals whose absolute values exceed 2.58, we conclude that there are nine values indicative of possible misfit in the model. LISREL concludes this portion of the output with a summary and identification of these nine standardized residuals.
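The tally of nine large residuals is easy to verify from the values LISREL flags in Table 3.6: count those whose absolute value exceeds 2.58 (a sketch; the remaining elements of the 136-element residual matrix fall below this threshold):

```python
# Standardized residuals flagged in Table 3.6
flagged = [-3.04, -2.95, -2.88, -3.25, -4.21,   # largest negative
           4.12, 2.89, 3.26, 2.87]              # largest positive

# A standardized residual is "large" when its magnitude exceeds 2.58,
# the two-tailed .01 critical value of z
large = [r for r in flagged if abs(r) > 2.58]
print(len(large))  # -> 9, the count of possible points of misfit
```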
Modification Indices. The second block of information related to misspecification reflects the extent to which the hypothesized model is appropriately described. Evidence of misfit in this regard is captured by the modification indices (MIs), each of which can be conceptualized as a χ² statistic with one degree of freedom (Joreskog & Sorbom, 1988). Specifically, for each fixed parameter specified, LISREL provides an MI, the value of which represents the expected drop in overall χ² value if the parameter were to be freely estimated in a subsequent run; all freely estimated parameters automatically have MI values equal to zero. Although this decrease in χ² is expected to approximate the MI value, the actual differential can be larger. Associated with each MI is an Expected (Parameter) Change value (Saris, Satorra, & Sorbom, 1987), which is reported in a separate matrix following the MI values. This statistic represents the predicted estimated change, in either a positive or negative direction, for each fixed parameter in the model and yields important information regarding the sensitivity of the evaluation of fit to any reparameterization of the model.¹² The MIs and accompanying expected parameter change statistics related to our hypothesized model are presented in Table 3.7. As shown in Table 3.7, the MIs for the factor loading matrix (Λx) are presented first, followed by the expected parameter change associated with each parameter in this matrix that was specified as fixed to 0.0. The other set of MIs relates to the Θδ matrix. Because all parameters in the Φ matrix were freely estimated, the program notes that there are "NO NON-ZERO MODIFICATION INDICES FOR PHI." Following the presentation of information related to each fixed parameter, LISREL identifies the largest MI; in other words, it pinpoints which parameter, if set free, would lead to the largest drop in χ² value. In the present case, this parameter is TD(15,14) with
¹²Bentler (1995) has noted, however, that because these parameter change statistics are sensitive to the way by which variables and factors are scaled or identified, their absolute value is sometimes difficult to interpret.
Table 3.7
LISREL Output for Hypothesized Four-factor CFA Model: Modification Indices and Expected Change

MODIFICATION INDICES FOR LAMBDA-X
  [16 × 4 matrix (SDQ2N01 through SDQ2N43 by GSC, ASC, ESC, MSC) of MI values
  for the fixed factor loadings; the largest is 11.21, for SDQ2N07 on ASC]

EXPECTED CHANGE FOR LAMBDA-X
  [corresponding matrix of expected parameter change values; the value for
  SDQ2N07 on ASC is -0.56]

NO NON-ZERO MODIFICATION INDICES FOR PHI

MODIFICATION INDICES FOR THETA-DELTA
  [16 × 16 symmetric matrix of MI values for the error variances and
  covariances; the largest is 17.75, for element (15,14)]

EXPECTED CHANGE FOR THETA-DELTA
  [corresponding matrix of expected change values; the value for element
  (15,14) is -0.33]

MAXIMUM MODIFICATION INDEX IS 17.75 FOR ELEMENT (15,14) OF THETA-DELTA
an MI value of 17.75. As determined from the Θδ matrix, it represents an error covariance between SDQ2N19 and SDQ2N31 and has an expected value (if it were to be freely estimated) of -.33.
POST HOC ANALYSES

At this point, the researcher can decide whether or not to respecify and reestimate the model. If he or she elects to follow this route, it is important to realize that analyses are now framed within an exploratory, rather than a confirmatory, mode. In other words, once an hypothesized CFA model, for example, has been rejected, this spells the end of the confirmatory factor analytic approach, in its truest sense. Although CFA procedures continue to be used in any respecification and reestimation of the model, these analyses are exploratory in the sense that they focus on the detection of misfitting parameters in the originally hypothesized model. Such post hoc analyses are conventionally termed specification searches (see MacCallum, 1986; the issue of post hoc model fitting is addressed further in chapter 4 in the section dealing with cross-validation). The decision of whether or not to proceed with a specification search is twofold. First and foremost, the researcher must determine whether the estimation of the targeted parameter is substantively meaningful. If, indeed, it makes no sound substantive sense to free up the parameter exhibiting the largest MI, then one may wish to consider the parameter having the next largest MI value (Joreskog, 1993). Second, one needs to consider whether or not the respecified model would lead to an overfitted model. The issue here is tied to the idea of knowing when to stop fitting the model, or as Wheaton (1987) phrased the problem, "knowing ... how much fit is enough without being too much fit" (p. 123). In general, overfitting a model involves the specification of additional parameters in the model after having determined a criterion that reflects a minimally adequate fit.
For example, an overfitted model can result from the inclusion of additional parameters that: (a) are "fragile" in the sense of representing weak effects that are not likely replicable, (b) lead to a significant inflation of standard errors, and (c) influence primary parameters in the model, albeit their own substantive meaningfulness is somewhat equivocal. Although correlated errors often
fall into this latter category,¹³ there are many situations (particularly with respect to social psychological research) where these parameters can make strong substantive sense and therefore should be included in the model (Joreskog & Sorbom, 1993c). Returning again to Table 3.7, notice that the second largest MI (11.21) represents the cross-loading of SDQ2N07, designed to measure math self-concept, on academic self-concept; its expected parameter change statistic is -.56. Although one could argue for the viability of both this MI and the largest one reviewed earlier, the negative estimated parameter change statistic makes little sense. Taken together, this information, and the CFI value of .96 for overall model fit, leads me to conclude that any further incorporation of parameters into the model would result in an overfitted model. Indeed, MacCallum et al. (1992) have cautioned that "when an initial model fits well, it is probably unwise to modify it to achieve even better fit because modifications may simply be fitting small idiosyncratic characteristics of the sample" (p. 501). Adhering to this caveat, I concluded that the four-factor model schematically portrayed in Fig. 3.1 represents an adequate description of self-concept structure for Grade-7 adolescents.
Hypothesis II: Self-Concept Is a Two-Factor Structure

The model to be tested here postulates a priori that SC is a two-factor structure consisting of GSC and ASC. As such, it argues against the viability of subject-specific academic SC factors. As with the four-factor model, the four GSC measures load onto the GSC factor; in contrast, all other measures load onto the ASC factor. This hypothesized model is represented schematically in Fig. 3.2. In specifying this two-factor model of SC, only the section of the input file beginning with the MO line required modification; all other information remained the same. Thus, only this portion of the input file is presented in Table 3.8.
¹³Typically, the misuse in this instance arises from the incorporation of correlated errors into the model purely on the basis of statistical fit, and for the purpose of achieving a better fitting model.
[Figure: path diagram in which the four GSC measures (SDQ2N01, SDQ2N13, SDQ2N25, SDQ2N37) load on GSC (ξ1) and the remaining 12 measures (SDQ2N04 through SDQ2N43) load on ASC (ξ2); each observed variable (X) has an associated error term (δ), and the two factors are correlated.]
FIG. 3.2. Two-factor Model of Self-concept.
Table 3.8
Selected LISREL Input for Hypothesized Two-Factor CFA Model

MO NX=16 NK=2 LX=FU,FI PH=SY,FR TD=SY
LK
GSC ASC
FR LX(2,1) LX(3,1) LX(4,1)
FR LX(6,2) LX(7,2) LX(8,2) LX(9,2) LX(10,2) LX(11,2) LX(12,2)
FR LX(13,2) LX(14,2) LX(15,2) LX(16,2)
VA 1.0 LX(1,1) LX(5,2)
OU
In reviewing these specifications, three points relative to the modification of the input file are of interest. First, consistent with the specification of a two-factor model, the MO line has been altered to signify two rather than four ξs; accordingly, the labels representing ESC and MSC have been deleted. Second, although the pattern of factor loadings remains the same for the GSC and ASC measures, it changes for both the ESC and MSC measures in allowing them to load onto the ASC factor. As such, their LISREL specification as LX parameters must be altered accordingly; their new designations range from LX(9,2) through LX(16,2). Finally, because only one of these eight factor loadings needs to be fixed to 1.0, the two previously constrained parameters [LX(9,3); LX(13,4)] are now freely estimated. Only the goodness-of-fit statistics are relevant to the present application, and these are presented in Table 3.9. As indicated in the output, the χ²(103) value of 455.93 represents a poor fit to the data, and a substantial decrement from the overall fit of the four-factor model (Δχ²(5) = 297.42). The gain of five degrees of freedom can be explained by the estimation of two fewer factor variances and five fewer factor covariances, offset by the estimation of two additional factor loadings (formerly LX(9,3) and LX(13,4)). As expected, all other indices of fit reflect the fact that self-concept structure is not well represented by the hypothesized two-factor model. In particular, the CFI value of .78 is strongly indicative of inferior goodness-of-fit between the hypothesized two-factor model and the sample data.
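The degrees-of-freedom accounting in the paragraph above can be double-checked by simple parameter counting. The sketch below is my own arithmetic, not LISREL output, and the helper function is hypothetical.

```python
# Degrees-of-freedom bookkeeping for the two competing CFA models.

def cfa_df(p, free_loadings, n_factors):
    """df = p(p+1)/2 distinct moments, minus q free parameters
    (free loadings + factor variances/covariances + p error variances)."""
    moments = p * (p + 1) // 2
    q = free_loadings + n_factors * (n_factors + 1) // 2 + p
    return moments - q

df_four = cfa_df(16, 12, 4)   # 4 loadings fixed to 1.0 -> 12 free; 136 - 38 = 98
df_two = cfa_df(16, 14, 2)    # 2 loadings fixed to 1.0 -> 14 free; 136 - 33 = 103
delta_df = df_two - df_four   # the gain of 5 degrees of freedom noted above
```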
Hypothesis III: Self-Concept Is a One-Factor Structure

Although it now seems obvious that the structure of SC for Grade-7 adolescents is best represented by a multidimensional model, there are still researchers who contend that SC is a unidimensional construct. Thus, for
Application 1
purposes of completeness, and to address the issue of unidimensionality, Byrne and Worth Gavin (1996) proceeded to test this hypothesis. However, because the one-factor model represents a restricted version of the two-factor model, and thus cannot possibly represent a better fitting model, in the interest of space, these analyses are not presented here. In summary, it is evident from these analyses that both the two-factor and one-factor models of self-concept represent a misspecification of factorial structure for early adolescents. Based on these findings, therefore, Byrne and Worth Gavin (1996) concluded that SC is a multidimensional construct, which in their study comprised the four facets of general, academic, English, and mathematics self-concepts.
Table 3.9
Selected LISREL Output for Hypothesized Two-Factor CFA Model

GOODNESS OF FIT STATISTICS

CHI-SQUARE WITH 103 DEGREES OF FREEDOM = 455.93 (P = 0.0)
ESTIMATED NON-CENTRALITY PARAMETER (NCP) = 352.93
MINIMUM FIT FUNCTION VALUE = 1.73
POPULATION DISCREPANCY FUNCTION VALUE (F0) = 1.34
ROOT MEAN SQUARE ERROR OF APPROXIMATION (RMSEA) = 0.11
P-VALUE FOR TEST OF CLOSE FIT (RMSEA < 0.05) = 0.00000037
EXPECTED CROSS-VALIDATION INDEX (ECVI) = 1.98
ECVI FOR SATURATED MODEL = 1.03
ECVI FOR INDEPENDENCE MODEL = 6.55
CHI-SQUARE FOR INDEPENDENCE MODEL WITH 120 DEGREES OF FREEDOM = 1696.73
INDEPENDENCE AIC = 1728.73
MODEL AIC = 521.93
SATURATED AIC = 272.00
INDEPENDENCE CAIC = 1802.00
MODEL CAIC = 673.06
SATURATED CAIC = 894.84
ROOT MEAN SQUARE RESIDUAL (RMR) = 0.18
STANDARDIZED RMR = 0.10
GOODNESS OF FIT INDEX (GFI) = 0.75
ADJUSTED GOODNESS OF FIT INDEX (AGFI) = 0.68
PARSIMONY GOODNESS OF FIT INDEX (PGFI) = 0.57
NORMED FIT INDEX (NFI) = 0.73
NON-NORMED FIT INDEX (NNFI) = 0.74
PARSIMONY NORMED FIT INDEX (PNFI) = 0.63
COMPARATIVE FIT INDEX (CFI) = 0.78
INCREMENTAL FIT INDEX (IFI) = 0.78
RELATIVE FIT INDEX (RFI) = 0.69
CRITICAL N (CN) = 81.66

CONFIDENCE LIMITS COULD NOT BE COMPUTED DUE TO TOO SMALL P-VALUE FOR CHI-SQUARE
FROM THE SIMPLIS PERSPECTIVE

Having thoroughly reviewed the LISREL approach to the testing of these three hypotheses, let us turn our attention now to the SIMPLIS input and output files. For reasons of space, only files related to the initial four-factor model are included here. Because it provided me with the opportunity to show you another important feature of the PRELIS program, I have chosen not to base the SIMPLIS analyses on the full data set, as was the case for the LISREL analyses, but rather, to use a second data file comprising only the 16 variables of interest. Creation of this new data file is performed by PRELIS.
Preliminary Analyses Using PRELIS: Creating a Raw Data File

In working with large data sets, researchers often wish to create smaller subsets of data for a diversity of reasons (e.g., less cumbersome, space-saving). PRELIS can be used very easily to carry out this task simply by making the request on the OU line of the file. Because this application represents our first PRELIS file, I consider it important to take this opportunity to review its various components in detail. This PRELIS input file is shown in Table 3.10.
Table 3.10
PRELIS Input for Creating Raw Data Subset

! Creation of Smaller Data Set for Use with SIMPLIS "ASC7SIMP"
! 1st-order 4-Factor CFA Model
DA NI=46 NO=265
LA
SPPCN08 SPPCN18 SPPCN28 SPPCN38 SPPCN48 SPPCN58 SPPCN01 SPPCN11
SPPCN21 SPPCN31 SPPCN41 SPPCN51 SPPCN06 SPPCN16 SPPCN26 SPPCN36
SPPCN46 SPPCN56 SPPCN03 SPPCN13 SPPCN23 SPPCN33 SPPCN43 SPPCN53
SDQ2N01 SDQ2N13 SDQ2N25 SDQ2N37 SDQ2N04 SDQ2N16 SDQ2N28 SDQ2N40
SDQ2N10 SDQ2N22 SDQ2N34 SDQ2N46 SDQ2N07 SDQ2N19 SDQ2N31 SDQ2N43
MASTENG1 MASTMAT1 TENG1 TMAT1 SENG1 SMAT1
RA=ASC7INDM.DAT FO
(40F1.0,X,6F2.0)
SD SPPCN08-SPPCN53 MASTENG1-SMAT1 !Only Variables 25-40 in Analysis
CO ALL
OU RA=ASC7SIMP.DAT WI=1 ND=0
At first blush, this input file looks very similar to the LISREL file presented in Table 3.2. However, there are four important differences between the two programs. First, with PRELIS the entry of an asterisk between the LA keyword and the first variable label is not an option. In fact, if you were to include this symbol, the program would interpret it as a variable label; as a result, the last variable to be included in the analyses would be excluded. Second, whereas in LISREL the variables to be included in the analyses (if they are a subset of the total number of variables) are identified on the SE line, in PRELIS, the variables to be excluded are specified on the SD (select and delete) line. Thus, in reviewing the SD line shown in Table 3.10, we see that the variables preceding SDQ2N01 (Variable 25) and following SDQ2N43 (Variable 40) are to be deleted, thereby leaving the analyses to be based on the 16 variables of interest. Third, the specification of "CO ALL" tells the program that all 16 variables are to be treated as continuous variables. Finally, by stating "RA=ASC7SIMP.DAT" on the OU line, the program is advised that we wish to create a raw data set with the file name "ASC7SIMP.DAT". In addition, the program is told that each variable in the new data set is to occupy one column only (WI=1) and that there are no decimals (ND=0). Once this input file is executed, both the newly created data set and the output file for this PRELIS job will be produced. The first 10 lines, along with the last line, of the new data set are shown in Table 3.11. You will note that the FORTRAN statement precedes the data. This line was added manually after the data file was created because, as
Table 3.11
Raw Data Subset Created by PRELIS

(16F1.0)
6546344626156666
6666666656666666
4662646365456631
5556565656356665
6554344446563445
5553324444564544
1616466616165665
2164444666153446
5566666656566666
4636656646465666
...
4646656646466666
Table 3.12
SIMPLIS Input for Hypothesized Four-Factor CFA Model

! Testing for the Multidimensionality of SC for Grade 7 "ASC7F4I"
! 1st-order 4-Factor CFA Model
! Initial Model
OBSERVED VARIABLES:
SDQ2N01 SDQ2N13 SDQ2N25 SDQ2N37 SDQ2N04 SDQ2N16 SDQ2N28 SDQ2N40
SDQ2N10 SDQ2N22 SDQ2N34 SDQ2N46 SDQ2N07 SDQ2N19 SDQ2N31 SDQ2N43
RAW DATA from FILE ASC7SIMP.DAT SAMPLE SIZE: 265
LATENT VARIABLES:
GSC ASC ESC MSC
EQUATIONS:
SDQ2N01 = 1*GSC
SDQ2N13 = GSC
SDQ2N25 = GSC
SDQ2N37 = GSC
SDQ2N04 = 1*ASC
SDQ2N16 = ASC
SDQ2N28 = ASC
SDQ2N40 = ASC
SDQ2N10 = 1*ESC
SDQ2N22 = ESC
SDQ2N34 = ESC
SDQ2N46 = ESC
SDQ2N07 = 1*MSC
SDQ2N19 = MSC
SDQ2N31 = MSC
SDQ2N43 = MSC
END OF PROBLEM
noted in chapter 2, format statements must be included in the data file, rather than in the input file, when using the SIMPLIS language.
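What the PRELIS job in Table 3.10 accomplishes, reading fixed-format records laid out as (40F1.0,X,6F2.0) and keeping only Variables 25 through 40 at a width of one column, can be sketched in plain Python. The record below is invented for illustration; its characters in positions 25-40 are chosen to match the first case shown in Table 3.11.

```python
def subset_record(line):
    """Keep only Variables 25-40 of a raw record laid out as
    (40F1.0,X,6F2.0): forty 1-column scores, a skipped column, then
    six 2-column scores. Variables 25-40 all fall in the 1-column
    block, i.e. character positions 24..39 (0-based)."""
    return line[24:40]

# One hypothetical 53-character record (40 digits, a blank, six
# 2-digit values); only the middle 16 digits are of interest.
raw = "1234561234561234561234566546344626156666 101112131415"
subset = subset_record(raw)   # 16 characters, width 1, no decimals
```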
The SIMPLIS Input File

The SIMPLIS input file for the initially hypothesized four-factor model follows the same pattern as the three example files shown in chapter 2, with one exception: the sample size was placed on the same line as the command word instead of on the line following it; either setup is acceptable. This file is presented in Table 3.12. In reviewing Table 3.12, there are four points worthy of mention. First, consistent with the newly created subset of data, only 16 variable labels are listed. Second, note that the raw data are specified as residing in the newly created external file ASC7SIMP.DAT (compare with Table 3.10). Third, consonant with the LISREL input file, the first of each set of congeneric variables is specified as a fixed parameter having a nonzero value of 1.00.
Finally, in its present form, this input file requests that only the standard SIMPLIS output be provided. However, the user can request the LISREL output file by simply adding a line immediately before the END OF PROBLEM with the command LISREL OUTPUT. Two-letter keywords representing particular output information can be added after this command; in the absence of these selected keywords, only the standard LISREL output will be provided.
The SIMPLIS Output File

Because the SIMPLIS output file basically differs from the LISREL file only in the way information is conveyed, this file will not be printed in each of the subsequent chapters in this book. It is presented here only for didactic purposes in showing you what to expect in terms of the presentation of

Table 3.13
Selected SIMPLIS Output for Hypothesized Four-Factor CFA Model

LISREL ESTIMATES (MAXIMUM LIKELIHOOD)

SDQ2N01 = 1.00*GSC, Errorvar.= 1.20 , R2 = 0.34
                               (0.13)
                                9.52

SDQ2N13 = 1.08*GSC, Errorvar.= 1.12 , R2 = 0.39
          (0.15)               (0.12)
           7.03                 9.00

SDQ2N25 = 0.85*GSC, Errorvar.= 1.06 , R2 = 0.30
          (0.13)               (0.11)
           6.44                 9.88

SDQ2N37 = 0.93*GSC, Errorvar.= 0.77 , R2 = 0.41
          (0.13)               (0.088)
           7.12                 8.80

THE MODIFICATION INDICES SUGGEST TO ADD THE PATH
TO         FROM      DECREASE IN CHI-SQUARE    NEW ESTIMATE
SDQ2N07    ASC              11.2                  -0.56
SDQ2N07    ESC               9.2                  -0.30

THE MODIFICATION INDICES SUGGEST TO ADD AN ERROR COVARIANCE
BETWEEN    AND       DECREASE IN CHI-SQUARE    NEW ESTIMATE
SDQ2N25    SDQ2N01          17.0                   0.36
SDQ2N40    SDQ2N04           9.3                  -0.24
SDQ2N31    SDQ2N07          10.7                   0.31
SDQ2N31    SDQ2N19          17.8                  -0.33
SDQ2N43    SDQ2N19           8.3                   0.22
results. The standard output file for the initially hypothesized model differs from the LISREL output in only two ways: the presentation of (a) the estimates and R² values, and (b) the modification indices and expected parameter change statistics. Shown in Table 3.13 are the factor loading parameter estimates for general SC only, along with the misspecification information. The first thing you will likely note here is that the information is not presented in matrix form couched in Greek symbol names. Rather, all information related to the estimation of each observed variable is provided together. Likewise, information bearing on possible model misspecification is more compact and explanatory. For example, recall that the largest modification index was 17.75 and related to the error covariance between SDQ2N31 and SDQ2N19 (see Table 3.7). As you can see in Table 3.13, the SIMPLIS output file identifies this specification error by actually naming the two variables involved. In lieu of actually stating the value of the MI, it specifies the amount of decrease in χ² that will occur should the parameter be freely estimated; likewise, the expected parameter change statistic is expressed as the new estimated value (that should occur in light of such respecification).
APPLICATIONS RELATED TO OTHER DISCIPLINES

Business: Nelson, J. E., Duncan, C. P., & Kiecker, P. L. (1993). Toward an understanding of the distraction construct in marketing. Journal of Business Research, 26, 201-221.

Medicine: Lobel, M., & Dunkel-Schetter, C. (1990). Conceptualizing stress to study effects on health: Environmental, perceptual, and emotional components. Anxiety Research, 3, 213-230.

Sociology: Shu, X., Fan, P. L., Li, X., & Marini, M. M. (1996). Characterizing occupations with data from the Dictionary of Occupational Titles. Social Science Research, 25, 149-173.
CHAPTER 4

Application 2: Testing the Factorial Validity of Scores From a Measuring Instrument (First-Order CFA Model)

For our second application, we once again examine a first-order CFA model. However, this time we test hypotheses bearing on a single measuring instrument, the Maslach Burnout Inventory (MBI; Maslach & Jackson, 1981, 1986), designed to measure three dimensions of burnout that the authors termed Emotional Exhaustion (EE), Depersonalization (DP), and Reduced Personal Accomplishment (PA). The term burnout denotes the inability to function effectively in one's job as a consequence of prolonged and extensive job-related stress; emotional exhaustion represents feelings of fatigue that develop as one's energies become drained; depersonalization is the development of negative and uncaring attitudes towards others; and reduced personal accomplishment is a deterioration of self-confidence and a dissatisfaction in one's achievements. Indeed, the escalation and prevalence of burnout among members of the teaching profession over this past decade has been of great concern to both educational administrators and clinicians. Largely as a consequence of the international scope of the burnout problem, there is now a plethora of research on the topic (for a review of this international literature, see Byrne, in press). The purposes of the original study from which this example is taken (Byrne, 1993) were fourfold: (a) to test for the factorial validity of the MBI separately for elementary, intermediate, and secondary teachers; (b) given findings of inadequate fit, to propose and test an alternative factorial structure; (c) to cross-validate this structure across a second independent sample for each teacher group; and (d) to test for the equivalence of item measurements and theoretical structure across the three teaching panels. Only analyses bearing on the factorial validity of the MBI for the calibration sample of elementary teachers are of interest in the present chapter.

Confirmatory factor analysis of a measuring instrument is most appropriately applied to measures that have been fully developed and their factor structures validated. The legitimacy of CFA use, of course, is tied to its conceptual rationale as a hypothesis-testing approach to data analysis. That is to say, based on theory, empirical research, or a combination of both, the researcher postulates a model and then tests for its validity given the sample data. Thus, application of CFA procedures to assessment instruments that are still in the initial stages of development represents a serious misuse of this analytic strategy. In testing for the validity of factorial structure for an assessment measure, the researcher seeks to determine the extent to which items designed to measure a particular factor (i.e., latent construct) actually do so. In general, subscales of a measuring instrument are considered to represent the factors; all items comprising a particular subscale are therefore expected to load onto its related factor. Given that the MBI has been commercially marketed since 1981, is the most widely used measure of occupational burnout, and has undergone substantial testing of its psychometric properties over the years (see, e.g., Byrne, 1991, 1993, 1994b), it most certainly qualifies as a candidate for CFA research. Interestingly, until my 1991 study of the MBI, virtually all previous factor analytic work had been based only on exploratory procedures. We turn now to a description of the structure of this instrument.
THE MEASURING INSTRUMENT UNDER STUDY

The MBI is a 22-item instrument structured on a 7-point Likert-type scale that ranges from 0 (feeling has never been experienced) to 6 (feeling experienced daily). It is composed of three subscales, each measuring one facet of burnout; the EE subscale comprises nine items, the DP subscale five, and the PA subscale eight. The original version of the MBI (Maslach & Jackson, 1981) was constructed from data based on samples of workers from a wide range
of human service organizations. Recently, however, Maslach and Jackson (1986), in collaboration with Schwab, developed the Educators' Survey (MBI Form Ed), a version of the instrument specifically designed for use with teachers. The MBI Form Ed parallels the original version of the MBI except for the modified wording of certain items to make them more appropriate to a teacher's work environment. Specifically, the generic term "recipients," used in reference to clients, was replaced by the term "students."
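For reference, the item-to-subscale assignment just described can be written out explicitly. The dictionary below is my own illustrative restatement, with the item numbers as they appear later in Fig. 4.1 and in the SE line of the LISREL input; it is not program input.

```python
# Item composition of the three MBI subscales, as hypothesized in
# Fig. 4.1 (an illustrative restatement, not LISREL/PRELIS syntax).
MBI_SUBSCALES = {
    "EE": [1, 2, 3, 6, 8, 13, 14, 16, 20],  # Emotional Exhaustion (9 items)
    "DP": [5, 10, 11, 15, 22],              # Depersonalization (5 items)
    "PA": [4, 7, 9, 12, 17, 18, 19, 21],    # Personal Accomplishment (8 items)
}

# The SE line in the LISREL input simply lists these items in this order:
se_order = sum(MBI_SUBSCALES.values(), [])
```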
THE HYPOTHESIZED MODEL

The CFA model in the present example hypothesized a priori that: (a) responses to the MBI could be explained by three factors; (b) each item would have a nonzero loading on the burnout factor it was designed to measure, and zero loadings on all other factors; (c) the three factors would be correlated; and (d) measurement error terms would be uncorrelated. A schematic representation of this model is presented in Fig. 4.1.
Preliminary Analyses Using PRELIS

For didactic purposes, and for consistency with the original study (Byrne, 1993), we are going to treat the variables as if they were continuous. Although, as noted earlier, Jöreskog and Sörbom (1993b) would recommend that these 6-point MBI items be analyzed using estimation procedures appropriate to the ordinality of their structure, I wish to delay your introduction to this more complicated procedure until the next chapter. I consider it important that you first become familiar with the more basic matrices and estimation procedures. Expanding on the discussion in chapter 3 regarding the treatment of ordinal variables as if they were continuous, two points need to be made in support of this strategy. First, maximum likelihood estimation is less problematic when the covariance rather than the correlation matrix is analyzed; analysis of the latter can yield absolutely incorrect estimates (Jöreskog & Sörbom, 1996c). Second, when the number of categories is large, the failure to address the ordinality of the data is likely negligible (Atkinson, 1988; Babakus et al., 1987; Muthén & Kaplan, 1985). Indeed, Bentler and Chou
[Figure: path diagram in which the three correlated factors Emotional Exhaustion (ξ1), Depersonalization (ξ2), and Personal Accomplishment (ξ3), with intercorrelations φ21, φ31, and φ32, are measured by the 22 MBI items (X1 through X22), each with an associated error term (δ1 through δ22).]
FIG. 4.1. Hypothesized Model of Factorial Structure for the Maslach Burnout Inventory (22 items).
(1987) have argued that, given normally distributed categorical variables, "continuous methods can be used with little worry when a variable has four or more categories" (p. 88). For our purposes here, then, we consider the MBI items as continuous variables and base the analyses on the covariance matrix. We use PRELIS to structure this matrix from raw data; the related input file is shown in Table 4.1. In reviewing the PRELIS input file, we see that the data description on the DA line indicates that there are 22 variables and 580 cases. In addition, the program is advised that all variables coded as "9" are to be treated as missing values (MI=9). Given that listwise deletion of missing data is the default in LISREL 8, this means that any case having at least one missing value will be deleted from the analysis. The RA line indicates that the data are in the form of a raw data matrix contained in a file labeled "CVEL1.DAT," and the fixed format (22F1.0) denotes that each of the 22 variables occupies one column. The CO ALL line indicates that all variables are to be treated as continuous variables. Finally, the OU line indicates that the covariance matrix (CM) is to be saved as an external file labeled "CVEL1.CMM." Now that we have used PRELIS to create the covariance matrix, let us turn to the LISREL input file shown in Table 4.2.
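Before doing so, it is worth being concrete about what listwise deletion under MI=9 means: any case containing even one missing-value code is discarded in its entirety. A minimal Python sketch of this behavior, using invented records rather than PRELIS code:

```python
def listwise_delete(cases, missing_code="9"):
    """Drop every case (here, a string of 1-column item codes) that
    contains the missing-value code anywhere: the PRELIS/LISREL 8
    listwise-deletion default described in the text."""
    return [c for c in cases if missing_code not in c]

# Two invented 16-character records for illustration:
cases = [
    "6546344626156666",  # complete case: retained
    "6549344626156666",  # one item coded 9: the whole case is dropped
]
kept = listwise_delete(cases)
```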
LISREL Input File

As noted in the title, the input file shown in Table 4.2 represents the initially hypothesized model of the MBI as portrayed in Fig. 4.1. Consistent with the previous PRELIS file, the DA line indicates that the data comprise 22
Table 4.1
PRELIS Input for Creation of Covariance Matrix

! Testing for the Validity of the MBI "CVEL1CM"
! Elementary Teachers - Calibration Sample
! Creation of Covariance Matrix File
DA NI=22 NO=580 MI=9
LA
ITEM1 ITEM2 ITEM3 ITEM4 ITEM5 ITEM6 ITEM7 ITEM8 ITEM9 ITEM10 ITEM11
ITEM12 ITEM13 ITEM14 ITEM15 ITEM16 ITEM17 ITEM18 ITEM19 ITEM20 ITEM21 ITEM22
RA=CVEL1.DAT FO
(22F1.0)
CO ALL
OU CM=CVEL1.CMM
Table 4.2
LISREL Input for Initially Hypothesized Model of MBI Structure

! Testing for the Validity of the MBI "CVEL1I"
! Elementary Teachers - Calibration Sample
! Initially Hypothesized Model
DA NI=22 NO=580 MA=CM
LA
ITEM1 ITEM2 ITEM3 ITEM4 ITEM5 ITEM6 ITEM7 ITEM8 ITEM9 ITEM10 ITEM11
ITEM12 ITEM13 ITEM14 ITEM15 ITEM16 ITEM17 ITEM18 ITEM19 ITEM20 ITEM21 ITEM22
CM=CVEL1.CMM
SE
1 2 3 6 8 13 14 16 20 5 10 11 15 22 4 7 9 12 17 18 19 21
MO NX=22 NK=3 LX=FU,FI PH=SY,FR TD=SY
LK
EE DP PA
FR LX(2,1) LX(3,1) LX(4,1) LX(5,1) LX(6,1) LX(7,1) LX(8,1) LX(9,1)
FR LX(11,2) LX(12,2) LX(13,2) LX(14,2)
FR LX(16,3) LX(17,3) LX(18,3) LX(19,3) LX(20,3)
FR LX(21,3) LX(22,3)
VA 1.0 LX(1,1) LX(10,2) LX(15,3)
OU RS
variables for each of 580 individuals (no cases were deleted as a consequence of missing data). In addition, this line specifies that analyses are to be based on the covariance matrix (MA=CM). The CM line signifies that the data are in the form of a covariance matrix that resides in an external file labeled "CVEL1.CMM"; this file, of course, is the one that was created using PRELIS. Next, we see that the SE keyword has been included, followed by numbers that reflect the position of each MBI item as it was entered into the data set. The SE command indicates that the data are to be read in an order that differs from the original structure. For example, it specifies that scores for Items 1, 2, and 3 are to be read first, followed by those for Items 6, 8, 13, and so on. In checking Fig. 4.1, you will see that this order is consistent with the hypothesized loading of items onto their respective factors. The MO line specifies that the postulated model comprises 22 observed variables and three latent constructs (EE, DP, PA), the factor loading matrix is full and fixed (LX=FU,FI), the factor correlation matrix is symmetric and free (PH=SY,FR), thereby signifying that the factors are correlated, and the error variance-covariance matrix is symmetric with diagonal elements free and off-diagonal elements fixed to zero (TD=SY); for further amplification of the TD specification, see chapter 3. Following the naming of the latent constructs, as specified by the LK keyword, the next eight lines tell the
program which parameters are to be freely estimated. The VA line specifies which parameters are to be constrained for purposes of statistical identification and scaling of the unobserved latent constructs; as with the other examples in this book, these constrained parameters have been given a value of 1.00.¹ Finally, the RS specified on the OU line indicates that we wish to have expanded information related to the residuals printed in the output file. More specifically, it requests that both the fitted and standardized residual matrices be printed, along with something called a Q-plot. By default, as was demonstrated in chapter 3, the LISREL 8 program automatically yields the modification indices (MIs), estimated parameter change values, t values, and standard errors; earlier versions of the program required that this information be separately requested on the OU line.
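As a check on the specification just reviewed, we can tally the free parameters implied by the MO, FR, and VA lines and confirm that they yield the 206 degrees of freedom reported in the output. The bookkeeping below is my own, not LISREL's:

```python
# Free-parameter count for the hypothesized MBI model (own arithmetic).
p = 22                         # observed variables (NX=22)
free_loadings = 22 - 3         # one loading per factor is fixed by the VA line
factor_var_cov = 3 * 4 // 2    # PH=SY,FR with NK=3: 3 variances + 3 covariances
error_variances = 22           # free diagonal of TD=SY

q = free_loadings + factor_var_cov + error_variances  # 19 + 6 + 22 = 47
df = p * (p + 1) // 2 - q                             # 253 - 47 = 206
```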
LISREL Output File

Now, let us turn our attention to the output file related to the initially hypothesized model of MBI structure. In contrast to chapter 3, however, only selected portions of this file are reviewed and discussed. We first examine indices of fit for the model as a whole, and then review the modification indices with a view to pinpointing areas of model misspecification.
Model Assessment

Goodness-of-Fit Summary. Because the various indices of model fit provided by the LISREL 8 program were discussed in chapter 3, model evaluation throughout the remaining chapters is limited to those criteria summarized in Table 4.3. These criteria were chosen on the basis of (a) their variant approaches to the assessment of model fit (see Hoyle, 1995a), and (b) their support in the literature as important indices of fit that should be reported.² This selection, of course, in no way implies that the remaining criteria are unimportant. Rather, it addresses the need for users to select a subset of

¹As was the case in chapter 3, the first of each congeneric set of items was constrained.
²For example, Sugawara and MacCallum (1993) have recommended that the RMSEA always be reported when maximum likelihood estimation is the only method used because it has been found to yield consistent results across estimation procedures when the model is well defined; MacCallum et al. (1996) have recently extended this caveat to include confidence intervals.
Table 4.3
Selected LISREL Output: Goodness-of-Fit Statistics

GOODNESS OF FIT STATISTICS

CHI-SQUARE WITH 206 DEGREES OF FREEDOM = 1010.89 (P = 0.0)
ESTIMATED NON-CENTRALITY PARAMETER (NCP) = 804.89
ROOT MEAN SQUARE ERROR OF APPROXIMATION (RMSEA) = 0.082
EXPECTED CROSS-VALIDATION INDEX (ECVI) = 1.91
STANDARDIZED RMR = 0.071
GOODNESS OF FIT INDEX (GFI) = 0.86
ADJUSTED GOODNESS OF FIT INDEX (AGFI) = 0.83
COMPARATIVE FIT INDEX (CFI) = 0.85

CONFIDENCE LIMITS COULD NOT BE COMPUTED DUE TO TOO SMALL P-VALUE FOR CHI-SQUARE
goodness-of-fit indices from the generous quantity provided in the LISREL 8 program.³ These selected indices of fit are presented in Table 4.3. It is important to note, however, that should the user wish to save the entire set of goodness-of-fit indices in a separate file, this is made possible in LISREL 8 by the inclusion of the keyword GF on the OU line, along with the desired filename (e.g., GF=MBISTATS.GFI, where MBISTATS.GFI represents the filename). In reviewing these criteria in terms of their optimal values (see chapter 3), we can see that they are consistent in their reflection of an ill-fitting model. Indeed, given the extremely small probability value associated with the χ² statistic, the confidence limits could not be computed.⁴ Thus, it is apparent that some modification in specification is needed in order to determine a model that better represents the sample data. To assist us in pinpointing possible areas of misfit, we examine both the residuals and the modification indices.⁵ We turn next to these indicators of model misspecification.

³Several fit indices provide basically the same information. For example, when the RNI ranges from 0.0 to 1.0, it is algebraically equivalent to the CFI (Gerbing & Anderson, 1993; Goffin, 1993), and the AIC, CAIC, and ECVI serve the same function in addressing parsimony; as noted in chapter 3, Bentler (1990) has recommended that the CFI be the index of choice rather than the NFI and NNFI.
⁴The inability to compute the CIs appears to be an anomaly specific to the LISREL 8 program.
⁵Of course, as noted in chapter 3, it is important to realize that once we have determined that the hypothesized model represents a poor fit to the data (i.e., the null hypothesis has been rejected), we cease to operate in a confirmatory mode of analysis. All model specification and estimation henceforth represent exploratory analyses.
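Several of the indices reported in Table 4.3 are simple functions of χ², its degrees of freedom, the sample size N, and the number of free parameters q (here, 47). The sketch below uses the definitions as I understand them from the LISREL 8 documentation; treat it only as an aid for checking the printed values, not as the program's code.

```python
import math

def fit_indices(chi2, df, n, q):
    """Noncentrality-based summaries as defined for LISREL 8 (to the
    best of my understanding): NCP = chi2 - df;
    RMSEA = sqrt(max(NCP, 0) / (df * (n - 1)));
    ECVI = (chi2 + 2q) / (n - 1)."""
    ncp = max(chi2 - df, 0.0)
    rmsea = math.sqrt(ncp / (df * (n - 1)))
    ecvi = (chi2 + 2 * q) / (n - 1)
    return ncp, rmsea, ecvi

# Values for the initially hypothesized MBI model (N = 580, q = 47):
ncp, rmsea, ecvi = fit_indices(1010.89, 206, 580, 47)
# ncp = 804.89, rmsea ~ 0.082, ecvi ~ 1.91, matching Table 4.3.
```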
Residuals

Recall from chapters 1 and 3 that the basic idea in assessing model fit is to determine the degree of similarity between the sample covariance matrix (designated as S) and the restricted (or predicted) covariance matrix (i.e., one on which the structure specified in the hypothesized model has been imposed, designated as Σ̂). Discrepancy in fit between these matrices is represented by the residual covariance matrix (S − Σ̂) and its standardized version, which takes standard deviations of the measured variables into account. Each element in these matrices represents the residual for a pair of observed variables. As such, each residual expresses the discrepancy between the observed and fitted covariance values; the standardized residual is this value divided by its estimated standard error (Jöreskog & Sörbom, 1993a). Because standardized residuals are independent of the units of measurement for each variable, they provide a standard metric for judging the size of the residual and are therefore easier than the unstandardized residuals to interpret. Selected information from this portion of the output file is presented in Table 4.4. In addition to the fitted and standardized residual matrices (not included here for reasons of space), which we requested by specifying RS on the OU line, LISREL lists the largest negative and positive standardized residuals. Given that a model describes the data well, these values should be small and evenly distributed; large residuals associated with particular parameters indicate their misspecification in the model, thereby affecting the overall model misfit. Only 11 of the 64 residuals listed in the output are shown in Table 4.4; this block includes the three that are exceptionally large. In reviewing these three standardized residuals, we can see that the lion's share of misfit in the model appears to lie with the covariances between Items 1 and 2, Items 6 and 16, and Items 10 and 11.
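These quantities can be made concrete with a toy example: given a loading matrix Λ, a factor covariance Φ, and an error covariance matrix Θδ, the model-implied matrix is Σ̂ = ΛΦΛ′ + Θδ, and each fitted residual is the corresponding element of S − Σ̂. All numbers below are invented for illustration, and the sample matrix is deliberately built to reproduce Σ̂ exactly so that every residual is zero.

```python
# Toy illustration of S minus Sigma-hat for a one-factor,
# three-indicator model; all numbers are invented.

def implied_cov(loadings, phi, theta):
    """Sigma-hat = Lambda * Phi * Lambda' + Theta-delta for a single
    factor with variance phi and uncorrelated errors theta (diagonal)."""
    p = len(loadings)
    return [[loadings[i] * phi * loadings[j] + (theta[i] if i == j else 0.0)
             for j in range(p)] for i in range(p)]

lam = [1.0, 0.8, 0.6]                       # factor loadings
sigma_hat = implied_cov(lam, phi=1.0, theta=[0.5, 0.4, 0.3])

S = [[1.5, 0.8, 0.6],                       # "observed" covariance matrix
     [0.8, 1.04, 0.48],
     [0.6, 0.48, 0.66]]

residuals = [[S[i][j] - sigma_hat[i][j] for j in range(3)] for i in range(3)]
# S was built to equal Sigma-hat exactly, so every residual is 0 here.
```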
Table 4.4
Selected LISREL Output: The Residuals

LARGEST POSITIVE STANDARDIZED RESIDUALS
  RESIDUAL FOR ITEM2  AND ITEM1   10.15
  RESIDUAL FOR ITEM14 AND ITEM2    3.74
  RESIDUAL FOR ITEM14 AND ITEM13   3.69
  RESIDUAL FOR ITEM16 AND ITEM6   13.42
  RESIDUAL FOR ITEM20 AND ITEM8    3.77
  RESIDUAL FOR ITEM20 AND ITEM13   4.00
  RESIDUAL FOR ITEM5  AND ITEM6    5.69
  RESIDUAL FOR ITEM5  AND ITEM16   5.75
  RESIDUAL FOR ITEM10 AND ITEM16   2.76
  RESIDUAL FOR ITEM11 AND ITEM13   2.60
  RESIDUAL FOR ITEM11 AND ITEM10   8.22

QPLOT OF STANDARDIZED RESIDUALS
[Q-plot not reproducible here; the standardized residuals (horizontal axis, ranging from -3.5 to 3.5) are plotted against normal quantiles.]

The Q-plot shown in the lower portion of the table presents a joint summary of all standardized residuals bearing on the model. In essence, it represents a plot of the standardized residuals against normal quantiles. For computational reasons, however, the plot is turned so that these residuals appear along the horizontal axis (Jöreskog & Sörbom, 1989). Each X in the plot signifies a single point; each * represents multiple points. Residuals that follow the dotted line rising at a 45-degree angle are indicative of a well-fitting model. Deviations from this pattern suggest that (a) the model is in some way misspecified, (b) the data are nonnormally distributed, (c) relations among particular variable pairs are of a nonlinear nature, or some combination thereof (Jöreskog & Sörbom, 1993a). Furthermore, standardized residuals that appear as outliers in the Q-plot reflect specification error in the model (Jöreskog, 1993). Examination of the Q-plot for our MBI data reveals a clear deviation from the dotted line, thereby providing further evidence that certain parameters in the model are misspecified. For a more specific examination of misfit, we turn our attention now to the MIs that are presented in part in Table 4.5.
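Before turning to Table 4.5, note that the Q-plot's construction, ordered standardized residuals paired with normal quantiles, can be sketched with the standard library (the residual values below are invented, not taken from the output):

```python
from statistics import NormalDist

# Invented standardized residuals; a real run would take these
# from the LISREL output requested with RS on the OU line.
residuals = [-1.8, -0.9, -0.3, 0.1, 0.4, 0.7, 1.1, 2.6]

def qplot_points(residuals):
    """Pair each ordered residual with a standard-normal quantile.

    For a well-fitting model the points fall near the 45-degree
    line; outlying points suggest misspecified parameters.
    """
    n = len(residuals)
    nd = NormalDist()
    quantiles = [nd.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(sorted(residuals), quantiles))

points = qplot_points(residuals)
```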
Table 4.5
Selected LISREL Output: Modification Indices

MODIFICATION INDICES FOR LAMBDA-X
[22 x 3 matrix of MIs for the items on the EE, DP, and PA factors; not fully recoverable from the scanned output. The two largest values are those associated with Items 12 and 16, discussed in the text.]

MODIFICATION INDICES FOR THETA-DELTA
[lower-triangular matrix of MIs among the item uniquenesses; not fully recoverable from the scanned output. The three largest values are
  ITEM2  AND ITEM1    102.98
  ITEM16 AND ITEM6    179.98
  ITEM11 AND ITEM10    67.62]

EXPECTED CHANGE FOR THETA-DELTA
[matrix of expected parameter change statistics; not fully recoverable from the scanned output]

MAXIMUM MODIFICATION INDEX IS 179.98 FOR ELEMENT (8,4) OF THETA-DELTA
Modification Indices. Based on our CFA model, the factor loadings and error covariances that were fixed to a value of 0.0 are of substantial interest; large MIs would argue for the presence of factor cross-loadings (i.e., a loading on more than one factor) and error covariances, respectively. Although a review of the MIs in the factor-loading matrix reveals several that are substantially high, I draw your attention to only the two highest values, those associated with Items 12 and 16, respectively. Both represent a substantial misspecification of the hypothesized factor loading. Such misspecification, for example, could mean that Item 12, in addition to measuring personal accomplishment, also measures emotional exhaustion; alternatively, it could indicate that although Item 12 was postulated to load on the PA factor, it may load more appropriately on the EE factor.

In the interest of space, only portions of the error-covariance matrix are shown in Table 4.5; included, however, are the highest MI values in terms of misspecified parameters. Turning to this abbreviated presentation of MIs and expected change statistics, we see very clear evidence of misspecification associated with the pairing of Items 1 and 2 (MI = 102.98), Items 6 and 16 (MI = 179.98), and Items 10 and 11 (MI = 67.62). These MIs are substantially larger than those remaining and represent misspecified error covariances;6 note that they mirror the information reviewed earlier with respect to the residuals. These measurement error covariances represent systematic, rather than random, measurement error in item responses and may derive from characteristics specific either to the items or to the respondents (Aish & Jöreskog, 1990). For example, if these parameters reflect item characteristics, they may represent a small omitted factor. If, on the other hand, they represent respondent characteristics, they may reflect bias such as yea-saying/nay-saying, social desirability, and the like (Aish & Jöreskog, 1990). Another type of method effect that can trigger correlated errors is a high degree of overlap in item content. Such redundancy occurs when an item, although worded differently, essentially asks the same question. Indeed, based on my work with the MBI (Byrne, 1991, 1993, 1994b), I believe the latter to be true with respect to this instrument.
POST HOC ANALYSES

Provided with information related both to model fit and to possible areas of model misspecification, a researcher may wish to consider respecifying an originally hypothesized model. As emphasized in chapter 3, should this be the case, it is critically important that he or she be cognizant of both the exploratory nature of, and the dangers associated with, the process of post hoc model fitting. (For greater discussion of this topic, see chapter 3; see also, e.g., Anderson & Gerbing, 1988; Cliff, 1983; Cudeck & Browne, 1983; Jöreskog, 1993; MacCallum, 1986.) For didactic purposes in illustrating the various aspects of post hoc model fitting, we will now proceed to respecify the initially hypothesized model of MBI structure, taking into account the misspecification information presented earlier.

6Although they technically represent error covariances, these parameters are typically referred to as correlated errors.
In the original paper from which this example is taken, information derived from both exploratory and confirmatory factor analyses of the MBI led me to conclude that Items 12 and 16 may be inappropriate for use with teachers. As a consequence, I respecified the model with these items deleted (see Anderson & Gerbing, 1988); all subsequent analyses in this chapter are based on this 20-item revision, which is labeled here as Model 2. Let us turn now to the input file for this respecified model, which is shown in Table 4.6.
LISREL Input File (Model 2)

At least three aspects of this file are worthy of note. First, Items 12 and 16 have been deleted from the variable list and, thus, a slash has been placed at the end of the SE line. Second, the specified number of variables has been changed from 22 to 20 (NX=20). Finally, as a consequence of the two-variable deletion, the parenthesized coordinates for most of the parameters have necessarily changed. For example, whereas the first of each congeneric set of parameters constrained to 1.00 were LX(1,1), LX(10,2), and LX(15,3) for the initially hypothesized 22-item scale, the constraints for Model 2 are placed on LX(1,1), LX(9,2), and LX(14,3). Let us turn now to Table 4.7, which contains the goodness-of-fit statistics for this respecified model.
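The renumbering of parameter coordinates after deleting variables is error-prone by hand; a small helper along these lines (purely illustrative, not part of LISREL) recovers the new positions from the SE ordering:

```python
# Item ordering from the SE line of the initially hypothesized
# (22-item) model, and the two items deleted for Model 2.
SE_ORDER = [1, 2, 3, 6, 8, 13, 14, 16, 20,   # EE indicators
            5, 10, 11, 15, 22,               # DP indicators
            4, 7, 9, 12, 17, 18, 19, 21]     # PA indicators
DROPPED = {16, 12}

def new_positions(se_order, dropped):
    """Map each retained item to its 1-based position after deletion."""
    kept = [item for item in se_order if item not in dropped]
    return {item: pos for pos, item in enumerate(kept, start=1)}

pos = new_positions(SE_ORDER, DROPPED)
# The reference indicators fixed to 1.00 move accordingly:
# Item 1 stays at LX(1,1); Item 5 moves from LX(10,2) to LX(9,2);
# Item 4 moves from LX(15,3) to LX(14,3).
```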
LISREL Output File (Model 2)

Although the various fit indices for this model are substantially improved over those for the initial model, there is still some evidence of misfit. For example, the CFI and GFI values of .90 are marginally adequate at best. To pinpoint areas of misspecification in the model, we turn now to the MIs that are shown in Table 4.8.
Table 4.6
Selected LISREL Input for Model 2

SE
1 2 3 6 8 13 14 20 5 10 11 15 22 4 7 9 17 18 19 21/
MO NX=20 NK=3 LX=FU,FI PH=SY,FR TD=SY
LK
EE DP PA
FR LX(2,1) LX(3,1) LX(4,1) LX(5,1) LX(6,1) LX(7,1) LX(8,1)
FR LX(10,2) LX(11,2) LX(12,2) LX(13,2)
FR LX(15,3) LX(16,3) LX(17,3) LX(18,3) LX(19,3) LX(20,3)
VA 1.0 LX(1,1) LX(9,2) LX(14,3)
OU
Table 4.7
Selected LISREL Output for Model 2: Goodness-of-Fit Statistics

CHI-SQUARE WITH 167 DEGREES OF FREEDOM = 618.52 (P = 0.0)
ESTIMATED NON-CENTRALITY PARAMETER (NCP) = 451.52
ROOT MEAN SQUARE ERROR OF APPROXIMATION (RMSEA) = 0.068
EXPECTED CROSS-VALIDATION INDEX (ECVI) = 1.22
STANDARDIZED RMR = 0.057
GOODNESS OF FIT INDEX (GFI) = 0.90
ADJUSTED GOODNESS OF FIT INDEX (AGFI) = 0.87
COMPARATIVE FIT INDEX (CFI) = 0.90
CONFIDENCE LIMITS COULD NOT BE COMPUTED DUE TO TOO SMALL P-VALUE FOR CHI-SQUARE
Table 4.8
Selected LISREL Output for Model 2: Modification Indices

MODIFICATION INDICES FOR THETA-DELTA
[lower-triangular matrix of MIs among the item uniquenesses; not fully recoverable from the scanned output. The two largest values are
  ITEM2  AND ITEM1    89.54
  ITEM11 AND ITEM10   68.45]

MAXIMUM MODIFICATION INDEX IS 89.54 FOR ELEMENT (2,1) OF THETA-DELTA
As might be expected from our earlier work, the constrained parameters exhibiting the highest degree of misfit lay in the TD matrix and represent correlated errors between Items 1 and 2 (MI = 89.54) and between Items 10 and 11 (MI = 68.45). Compared with MI values for all other parameters in the TD matrix, these are exceptionally high, and thus clearly in need of respecification. Unquestionably, the specification of correlated error terms solely for purposes of achieving a better fitting model is not an acceptable practice; as with other parameters, such specification must be supported by a strong substantive rationale, an empirical rationale, or both (Jöreskog, 1993). It is my belief that such a case exists here. Specifically, the correlated errors in the present context replicate those from my other validation work with the MBI (Byrne, 1991, 1994b), thereby arguing for the stability of this finding. Based on this logical and empirical rationale, the model was subsequently respecified with these two parameters freely estimated. It is important to note, however, that this process involved two separate model specifications and estimations; the first respecification (Model 3) incorporated only the error correlation between Items 1 and 2 (represented by the largest MI), whereas the second (and final) specification additionally incorporated the error correlation between Items 10 and 11. Because MI values are calculated univariately and thus can fluctuate from one estimation to another, it is important that model respecification address one MI at a time. The input file for this final model of MBI structure (Model 4) is shown in Table 4.9.
LISREL Input File (Final Model)

In reviewing Table 4.9, you will quickly note that there are only two differences between this input file for the final model and the one for Model 2 (see Table 4.6). The first relates to the specification of two error covariances: one between Items 1 and 2, and one between Items 10 and 11. The estimation of these two correlated errors is made possible by specifying TD(2,1) and TD(11,10) as free parameters in the model (see the third line from the bottom of the file). The second variation is the specification of SC on the OU line. The SC keyword requests that the completely standardized solution be included in the output file. Selected portions of this output file are presented in Tables 4.10 and 4.11. We turn first to Table 4.10, which summarizes the goodness-of-fit indices.
Table 4.9
Selected LISREL Input for Final Model of MBI Structure

SE
1 2 3 6 8 13 14 20 5 10 11 15 22 4 7 9 17 18 19 21/
MO NX=20 NK=3 LX=FU,FI PH=SY,FR TD=SY
LK
EE DP PA
FR LX(2,1) LX(3,1) LX(4,1) LX(5,1) LX(6,1) LX(7,1) LX(8,1)
FR LX(10,2) LX(11,2) LX(12,2) LX(13,2)
FR LX(15,3) LX(16,3) LX(17,3) LX(18,3) LX(19,3) LX(20,3)
FR TD(11,10) TD(2,1)
VA 1.0 LX(1,1) LX(9,2) LX(14,3)
OU SC
LISREL Output File (Final Model)

A review of the fit indices presented in Table 4.10 for the final model, and a comparison of these values with those summarized in Table 4.7 for Model 2 (the initial 20-item model of MBI structure), reveals the final model to be a substantially better fitting model. In particular, note the improved CFI value of .93 (versus .90), and the drop in the RMSEA (.057 vs. .068) and ECVI (.98 vs. 1.22) values.

Table 4.10
Selected LISREL Output for Final Model: Goodness-of-Fit Statistics

CHI-SQUARE WITH 165 DEGREES OF FREEDOM = 474.88 (P = 0.0)
ESTIMATED NON-CENTRALITY PARAMETER (NCP) = 309.88
ROOT MEAN SQUARE ERROR OF APPROXIMATION (RMSEA) = 0.057
EXPECTED CROSS-VALIDATION INDEX (ECVI) = 0.98
STANDARDIZED RMR = 0.051
GOODNESS OF FIT INDEX (GFI) = 0.92
ADJUSTED GOODNESS OF FIT INDEX (AGFI) = 0.90
COMPARATIVE FIT INDEX (CFI) = 0.93
CONFIDENCE LIMITS COULD NOT BE COMPUTED DUE TO TOO SMALL P-VALUE FOR CHI-SQUARE
In assessing the extent to which a respecified model exhibits improvement in fit, however, it has become customary to determine whether the difference in fit between the two models is statistically significant. As such, the researcher examines the difference in chi-square values (Δχ²) between the two models. Doing so, however, presumes that the two models are nested.7 The differential between the models represents a measure of the overidentifying constraints and is itself χ²-distributed, with degrees of freedom equal to the difference in degrees of freedom (Δdf); it can thus be tested statistically, with a significant Δχ² indicating substantial improvement in model fit. Comparison of our final model (χ²(165) = 474.88) with Model 2 (χ²(167) = 618.52), for example, yields a difference in χ² values of 143.64 (Δχ²(2) = 143.64).8

Admittedly, the estimated noncentrality parameter (NCP = 309.88) suggests that there is still some misfit in the model; in addition, the program still reported an inability to compute confidence intervals. However, because the most outstanding misfitting parameters at this point represented correlated errors among MBI items, I considered it inappropriate to continue fitting the model. Although familiarity with MBI item content provides clear evidence as to why these correlated errors are so prominent, post hoc fitting of the model to these parameters is open to severe criticism (see, e.g., Cliff, 1983). Thus, presented with the set of goodness-of-fit statistics in Table 4.10, I considered the final model to represent a substantively reasonable and statistically adequate model of MBI structure for elementary school teachers (see Fig. 4.2); no further post hoc analyses were therefore conducted.

The second point of interest in the output file for the final model relates to the provision of the standardized estimates. We turn now to Table 4.11, in which these values are summarized. Recall from chapter 3 that, in contrast to previous versions of the program, LISREL 8 standardizes all latent variables by default. As a consequence, if no specification is made on the OU line, the output will automatically include what is termed the standardized solution (SS). This solution results from the rescaling of the latent variables to have a variance of 1.00.

7Nested models are hierarchically related to one another in the sense that one is a subset of another; for example, particular parameters are freely estimated in one model, but fixed to zero in a second model (Bentler & Chou, 1987; Bollen, 1989b). Thus, it would be inappropriate to compare this final model with the initially hypothesized model, which was based on a 22-item MBI structure.
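The chi-square difference test reported above (Δχ²(2) = 143.64) is easy to check numerically. For Δdf = 2 the χ² survival function reduces exactly to exp(-x/2), so no statistical library is needed; the fit statistics are those reported for Model 2 and the final model:

```python
import math

# Chi-square values and degrees of freedom from Tables 4.7 and 4.10.
chi2_model2, df_model2 = 618.52, 167
chi2_final, df_final = 474.88, 165

delta_chi2 = chi2_model2 - chi2_final   # difference in chi-square
delta_df = df_model2 - df_final         # difference in df

# For 2 degrees of freedom, the chi-square survival function is
# exactly exp(-x / 2), giving the p-value of the difference test.
p_value = math.exp(-delta_chi2 / 2)
```

The resulting p-value is vanishingly small, consistent with the conclusion that freeing the two error covariances yields a statistically significant improvement in fit.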
Thus, although you will readily recognize the PH matrix as a correlation matrix (i.e., a standardized covariance matrix) in which the diagonal elements (i.e., the factor variances) are 1.00 and the off-diagonal elements represent correlations among the factors, you may find the estimates in the LX matrix to be somewhat strange. These estimates derive from the fact that, in this operation, the observed variables remain in their own metric; this SS solution is shown in Table 4.11. If, on the other hand, you wish also to have the observed variables rescaled, the correct estimates are obtained in the completely standardized solution;

8Two parameters previously specified as fixed in the initially hypothesized 20-item model (Model 2) were specified as free in the final model, thereby using up two degrees of freedom (i.e., the final model has two fewer degrees of freedom).
Table 4.11
Selected LISREL Output for Final Model: Standardized Estimates

STANDARDIZED SOLUTION

LAMBDA-X
            EE      DP      PA
ITEM1     1.25
ITEM2     1.14
ITEM3     1.28
ITEM6     1.00
ITEM8     1.56
ITEM13    1.32
ITEM14    1.19
ITEM20    1.07
ITEM5             0.97
ITEM10            0.97
ITEM11            0.98
ITEM15            0.64
ITEM22            0.61
ITEM4                     0.45
ITEM7                     0.47
ITEM9                     0.76
ITEM17                    0.61
ITEM18                    0.76
ITEM19                    0.74
ITEM21                    0.55

PHI
            EE      DP      PA
EE        1.00
DP        0.67    1.00
PA       -0.41   -0.56    1.00
as was shown in Table 4.9, this solution is indicated by the inclusion of the keyword SC on the OU line. Estimates from this solution are shown in the second section of Table 4.11. In reviewing these standardized estimates, you will want to verify that particular parameter values are consistent with the literature. For example, within the context of the present application, it is of interest to inspect correlations among the MBI factors for consistency with their previously reported values; in the present example, these estimates are as expected. In the interest of space, the values in the TD matrix have been reformatted from the output so that they could be presented in single lines, rather than in a triangular matrix. Finally, provided with this solution, we are able to see that the correlated errors between Items 1 and 2 (r = .19) and between Items 10 and 11 (r = .25) are quite substantial.
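The relation between the two solutions in Table 4.11 can be sketched numerically. With latent variances fixed at 1.00, a completely standardized loading is the standardized-solution loading divided by the observed variable's standard deviation, and an error covariance is likewise rescaled by the two observed standard deviations; the implied standard deviation below is back-calculated purely for illustration, not taken from the output:

```python
def completely_standardize_loading(loading_ss, sd_x):
    """Rescale a standardized-solution loading by the observed SD."""
    return loading_ss / sd_x

def standardize_error_covariance(theta_ij, sd_i, sd_j):
    """Rescale an error covariance by the product of the observed SDs."""
    return theta_ij / (sd_i * sd_j)

# Item 1's standardized-solution loading is 1.25 and its completely
# standardized loading is 0.75, implying an observed SD of about
# 1.25 / 0.75 (an illustrative back-calculation).
sd_item1 = 1.25 / 0.75
loading_cs = completely_standardize_loading(1.25, sd_item1)
```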
Table 4.11 cont.

COMPLETELY STANDARDIZED SOLUTION

LAMBDA-X
            EE      DP      PA
ITEM1     0.75
ITEM2     0.73
ITEM3     0.75
ITEM6     0.60
ITEM8     0.86
ITEM13    0.76
ITEM14    0.66
ITEM20    0.75
ITEM5             0.65
ITEM10            0.66
ITEM11            0.65
ITEM15            0.57
ITEM22            0.39
ITEM4                     0.48
ITEM7                     0.54
ITEM9                     0.57
ITEM17                    0.69
ITEM18                    0.63
ITEM19                    0.65
ITEM21                    0.42

PHI
            EE      DP      PA
EE        1.00
DP        0.67    1.00
PA       -0.41   -0.56    1.00

THETA-DELTA(a)
ITEM1  0.43   ITEM2  0.47   ITEM3  0.44   ITEM6  0.63   ITEM8  0.26   ITEM13 0.42
ITEM14 0.57   ITEM20 0.44   ITEM5  0.57   ITEM10 0.58   ITEM11 0.58   ITEM15 0.67
ITEM22 0.84   ITEM4  0.77   ITEM7  0.71   ITEM9  0.68   ITEM17 0.52   ITEM18 0.61
ITEM19 0.58   ITEM21 0.82

THETA-DELTA COVARIANCES
ITEM2 AND ITEM1: 0.19
ITEM11 AND ITEM10: 0.25

(a) Matrix of Theta-Delta estimates reformulated from the actual output file for reasons of space limitations.
FIG. 4.2. Final Model of Factorial Structure for the Maslach Burnout Inventory (20 items).
[Path diagram not reproducible here: X1-X8 (Items 1, 2, 3, 6, 8, 13, 14, 20) load on Emotional Exhaustion (ξ1); X9-X13 (Items 5, 10, 11, 15, 22) load on Depersonalization (ξ2); X14-X20 (Items 4, 7, 9, 17, 18, 19, 21) load on Personal Accomplishment (ξ3); each item has an associated error term (δ), with error covariances between Items 1 and 2 and between Items 10 and 11.]
FROM THE SIMPLIS PERSPECTIVE

Two example SIMPLIS files are presented here. The first relates to the SIMPLIS input file for the originally hypothesized model (i.e., the 22-item scale); the second demonstrates one procedure of cross-validation. We turn first to the input file for the initial model specification of MBI structure, which is shown in Table 4.12. Because this file is fairly straightforward, there are really only two important items of note. First, consistent with the LISREL input file (see Table 4.2), the same external covariance matrix was used; thus, in SIMPLIS language, we make the statement "COVARIANCE MATRIX from FILE CVEL1.CMM." Second, in contrast to the example provided in chapter 3, the present SIMPLIS file requests the output to be printed in matrix form; thus the command "LISREL OUTPUT" has been included in the file. An important point to note with this command, however, is that whereas the MIs are produced by default for a LISREL input file, they must be specifically requested when a SIMPLIS input file is implemented. Consequently, MI, as well as RS, has been included here. We turn now to the second example SIMPLIS file as it relates to the topic of cross-validation.

Table 4.12
SIMPLIS Input for Initially Hypothesized Model of MBI Structure

! Testing for the Validity of the MBI for Elementary Tchrs "CVEL1"
! Calibration Sample
! Initially Hypothesized Model
OBSERVED VARIABLES: ITEM1 ITEM2 ITEM3 ITEM4 ITEM5 ITEM6 ITEM7 ITEM8 ITEM9
  ITEM10 ITEM11 ITEM12 ITEM13 ITEM14 ITEM15 ITEM16 ITEM17 ITEM18 ITEM19
  ITEM20 ITEM21 ITEM22
COVARIANCE MATRIX from FILE CVEL1.CMM
SAMPLE SIZE: 580
LATENT VARIABLES: EE DP PA
EQUATIONS:
ITEM1 = 1*EE
ITEM2 = EE
ITEM3 = EE
ITEM6 = EE
ITEM8 = EE
ITEM13 = EE
ITEM14 = EE
ITEM16 = EE
ITEM20 = EE
ITEM5 = 1*DP
ITEM10 = DP
ITEM11 = DP
ITEM15 = DP
ITEM22 = DP
ITEM4 = 1*PA
ITEM7 = PA
ITEM9 = PA
ITEM12 = PA
ITEM17 = PA
ITEM18 = PA
ITEM19 = PA
ITEM21 = PA
LISREL OUTPUT: RS MI
END OF PROBLEM
Cross-validation in Covariance Structure Modeling

Typically, in applications of covariance structure modeling, the researcher tests an hypothesized model and then, from an assessment of various goodness-of-fit criteria, concludes that a statistically better fitting model could be attained by respecifying the model such that particular parameters previously constrained to zero are freely estimated (Breckler, 1990; MacCallum et al., 1992, 1993, 1994). Possibly as a consequence of considerable criticism of covariance structure modeling procedures over the past decade (Biddle & Marlin, 1987; Breckler, 1990; Cliff, 1983), most researchers who proceed with this respecification process are now generally familiar with the issues. In particular, they are cognizant of the exploratory nature of these follow-up procedures, as well as the fact that additionally specified parameters in the model must be theoretically substantiated. Indeed, the pros and cons of post hoc model fitting have been rigorously debated in the literature. Although some have severely criticized the practice (e.g., Cliff, 1983; Cudeck & Browne, 1983), others have argued that as long as the researcher is fully cognizant of the exploratory nature of his or her analyses, the process can be substantively meaningful because practical as well as statistical significance can be taken into account (Byrne, Shavelson, & Muthén, 1989; Tanaka & Huba, 1984). However, Jöreskog (1993, p. 298) was very clear in stating that "if the model is rejected by the data, the problem is to determine what is wrong with the model and how the model should be modified to fit the data better." Although purists would argue that once an hypothesized model is rejected, that is the end of the story, other researchers in this area are more reasonable in recognizing the obvious impracticality of terminating all subsequent model analyses.
Clearly, in the interest of future research, it behooves the investigator to probe deeper into the question of why the model is malfitting (see Tanaka, 1993). As a consequence of the concerted efforts of statistical experts in covariance structure modeling in addressing this issue, there are now several different approaches that can be used to increase the soundness of findings derived from these post hoc analyses.

Undoubtedly, post hoc model fitting in the analysis of covariance structures is problematic. With multiple model specifications, there is the risk of capitalization on chance factors because model modification may be driven by characteristics of the particular sample on which the model was tested (e.g., sample size, sample heterogeneity; MacCallum et al., 1992). As a consequence of this sequential testing procedure, then, there is increased risk of making either a Type I or Type II error and, at this point in time, there is no direct way to adjust for the probability of such error. Because hypothesized covariance structure models represent only approximations of reality and thus are not expected to fit real-world phenomena exactly (Cudeck & Browne, 1983; MacCallum et al., 1992), most research applications are likely to require the specification of alternative models in the quest for one that fits the data well (Anderson & Gerbing, 1988; MacCallum, 1986). Indeed, this aspect of covariance structure modeling represents a serious limitation, and, to date, several alternative strategies for model testing have been proposed (see, e.g., Anderson & Gerbing, 1988; Cudeck & Henly, 1991; MacCallum, 1995; MacCallum et al., 1992, 1993).

One approach to addressing problems associated with post hoc model fitting is to employ a cross-validation strategy whereby the final model derived from the post hoc analyses is tested on one or more additional independent samples from the same population. Barring the availability of separate data samples, given a sufficiently large sample, one may wish to randomly split the data into two (or more) parts, thereby making it possible to cross-validate the findings (see Cudeck & Browne, 1983).
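The random-split strategy for creating calibration and validation samples can be sketched as follows. This is a generic illustration (the original study's two samples of 580 and 579 teachers were drawn separately, not split from one file):

```python
import random

def split_sample(case_ids, seed=1234):
    """Randomly split cases into calibration and validation halves.

    A fixed seed keeps the split reproducible, which matters when
    the calibration half will be reanalyzed repeatedly.
    """
    rng = random.Random(seed)
    shuffled = case_ids[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Illustrative: 1,159 cases split into two non-overlapping halves.
calibration, validation = split_sample(list(range(1159)))
```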
As such, Sample 1 serves as the calibration sample on which the initially hypothesized model is tested, as well as any post hoc analyses conducted in the process of attaining a well-fitting model. Once this final model is determined, the validity of its structure can then be tested based on Sample 2 (the validation sample). In other words, the final best-fitting model for the calibration sample becomes the hypothesized model under test for the validation sample. Although there are several ways by which the similarity of model structure can be tested (see, e.g., MacCallum et al., 1994), Cudeck and Browne (1983) suggested the computation of a Cross-Validation Index (CVI) that measures the distance between the restricted (i.e., model-imposed) vari-
Application 2
159
ance-covariance matrix for the calibration sample and the unrestricted variance- covariance matrix for the validation sample. Because the estimated predictive validity of the model is guaged by the smallness of the CVI value, evaluation is facilitated by their comparison based on a series of alternative models. It is important to note, however, that the CVI estimate reflects overall discrepancy between "the actual population covariance matrix, S, and the estimated population covariance matrix reconstructed from the parameter estimates obtained from fitting the model to the sample" (MacCallum et al., 1994, p. 4). More specifically, this global index of discrepancy represents combined effects arising from the discrepancy of approximation (e.g., nonlinear relations among variables) and the discrepancy of estimation (e.g., sample size). (For a more extended discussion of these aspects of discrepancy, see Browne & Cudeck, 1989; Cudeck & Henly, 1991; MacCallum et al., 1994.) This two-sample cross-validation strategy can be conducted using LISREL 8 by saving the restricted variance-covariance matrix (Sigma) specific to each model tested for the calibration sample. The input file for saving the sigma matrix is shown in Table 4.13. Although the input file illustrated is based on the final model, those for Models 2 and 3 were used to obtain the related sigma matrix for each of these models; each sigma matrix must be save in a separate file. As can be seen on the OU line, the sigma matrix for the final
Table 4.13 LISREL Input for saying Sigma Matrix Testing for Validity of the MEl for Elementary Tchrs "CVELIFSI" Calibration Sample - Based on Computed Covariance Matrix Final Model - 3 correlated errors (ITEMIO/ll; ITEM1/2) - Items 16 and 12 deleted Saving sigma for Cross-validation with sample 2 (CVEL2) DA NI=22 NO=580 MA=CM
LA
ITEMl ITEM2 ITEM3 ITEM4 ITEM5 ITEM6 ITEM7 ITEM8 ITEM9 ITEMI0 ITEMII ITEM12 ITEM13 ITEM14 ITEM15 ITEM16 ITEM17 ITEM18 ITEM19 ITEM20 ITEM21 ITEM22 CM=CVELI . CMM SE 1 2 3 6 8 13 14 20 5 10 11 15 22 4 7 9 17 18 19 21/ MO NX=20 NK=3 LX=FU,FI PH=SY,FR TD=SY LX EE OP PA FR LX(2,1) LX(3,1) LX(4,1) LX(5,1) LX(6,1) LX(7,1) LX(8,1) FR LX(10,2) LX(11,2) LX(12,2) LX(13,2) FR LX(15,3) LX(16,3) LX(17,3) LX(18,3) LX(19,3) LX(20,3) FR TD(ll,lO) TD(1,2) VA 1.0 LX(l,l) LX(9,2) LX(14,3) OU SI=CVELIF.SIG
160
Chapter 4
Table 4.14 SIMPLIS Input for Cross-validation of Final Model Across Sample 2 ! Testing for Validity of the MBI for Elementary Tchrs ICVELlFCV" ! Cross-validating Final Model on Sample 2
OBSBRVBD VARIABLBS: ITEMl ITEM2 ITEM3 ITEM4 ITEM5 ITEM6 ITEM7 ITEM8 ITEM9 ITEMlO ITEMll ITEMl3 ITEMl4 ITEMl5 ITEMl7 ITEMl8 ITEMl9 ITEM20 ITEM2l ITEM22 COVARIANCE MATRIX from FILE CVEL2.CMM SAMPLB 'iIZB: 579 CROSSVALIDATB file CVELlF.SIG END OF PROBLEM
model was saved as an external file by simply specifying SI=CVEL1F.SIG, where CVEL1F.SIG represents the file name. Once the sigma matrices have been saved, it is then possible to proceed with the cross-validation process. The input file for the validation of the final sample has been structured as a SIMPLIS file and is shown in Table 4.14. Two essential aspects of this file are important to note. First, the estimation of the model is based on data for the validation sample (Sample 2); this is evident from the statement "COVARIANCE MATRIX from FILE CVEL2.CMM." Second, the statement "CROSSVALIDATE file CVEL1F.SIG" indicates that the sigma matrix based on the final model for the calibration sample provides the structure to be imposed on the variance-covariance matrix for the validation sample. Likewise, the sigma matrix for Models 2 and 3 would be tested on the Sample 2 data. The results of these three cross-validation runs are presented in Table 4.15. Model 2 represents the initial 20-item MBI structure; Model 3 is the same model, except for the inclusion of one correlated error (between Items 10 and 11); the final model includes the specification of an additional correlated error (between Items 1 and 2). For each model tested, the program provides a CVI value, along with a 90% confidence interval. In a comparison of CVI values, these findings suggest that Model 3 demonstrates the highest degree of predictive validity. In other words, of the three models, Model 3 is expected to replicate best for other independent samples from the same population. Indeed, these results exemplify an important aspect of the CVI observed by Cudeck and Browne (1983), Browne and Cudeck (1989), and Cudeck and Henly (1991)-that the model exhibiting the
161
Application 2
Table 4.15 SIMPLIS OutPUt for Cross-val idation of MSI Models Across Sanple 2 Model 2
3 Final
Cross-val idation Statistics CROSS-VALIDATION INDEX (CVI) = 8.96 90 PERCENT CONFIDENCE INTERVAL FOR CVI = (7.96 ; 10.03) CROSS-VALIDATION INDEX (CVI) = 8.69 90 PERCENT CONFIDENCE INTERVAL FOR CVI = (7.70 ; 9.76) CROSS-VALIDATION INDEX (CVI) = 8.75 90 PERCENT CONFIDENCE INTERVAL FOR CVI = (7.75 ; 9.81)
best evidence of replicability is not necessarily the one that is most complex (i.e., the one having the highest number of parameters to be estimated). Rather, models with the highest potential for replicability are a function of both model complexity and sample size. More specifically, whereas cross-validation is optimal for simple models when sample size is small, it is optimal for more complex models when sample size is large (MacCallum et al., 1994). Furthermore, degree of model misspecification and factor loading sizes have also been reported to affect cross-validation results (Bandalos, 1993). Thus, although Model 3 seemingly showed the highest potential to replicate across samples from the same population, it is possible that, given a larger sample size, the final model of MBI structure may well demonstrate the best evidence of cross-validity. As with other forms of assessment based on statistical criteria, Browne and Cudeck (1989) cautioned that one should not automatically choose the model that yields the smallest CVI value. Rather, other considerations that include the plausibility of the model and the substantive meaningfulness of the model parameters should be taken into account.
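To make the logic of the comparison concrete, the discrepancy underlying a CVI-style index can be sketched in Python. This is a minimal illustration, not the SIMPLIS computation: the function name `ml_discrepancy_2x2` and the two 2 × 2 matrices are hypothetical, and LISREL computes the CVI from the full validation-sample covariance matrix and the full calibration-sample sigma matrix.

```python
import math

def ml_discrepancy_2x2(S, Sigma):
    """ML discrepancy F = ln|Sigma| - ln|S| + tr(S Sigma^-1) - p for 2x2
    matrices. A CVI-style index applies such a discrepancy with S taken
    from the validation sample and Sigma implied by the calibration-sample
    model; F is zero only when the two matrices coincide."""
    det = lambda M: M[0][0] * M[1][1] - M[0][1] * M[1][0]
    d = det(Sigma)
    Sigma_inv = [[Sigma[1][1] / d, -Sigma[0][1] / d],
                 [-Sigma[1][0] / d, Sigma[0][0] / d]]
    trace = sum(S[i][k] * Sigma_inv[k][i] for i in range(2) for k in range(2))
    return math.log(det(Sigma)) - math.log(det(S)) + trace - 2

S_validation = [[1.0, 0.45], [0.45, 1.0]]       # hypothetical Sample 2 matrix
Sigma_calibration = [[1.0, 0.40], [0.40, 1.0]]  # hypothetical Sample 1 sigma
cvi_like = ml_discrepancy_2x2(S_validation, Sigma_calibration)
```

As in the tabled results, smaller values indicate better expected replication; identical matrices give a discrepancy of zero.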
APPLICATIONS RELATED TO OTHER DISCIPLINES

Education: O'Grady, K. E. (1989). Factor structure of the WISC-R. Multivariate Behavioral Research, 24, 177-193.
Chapter 4
Kinesiology: Pelletier, L. G., Vallerand, R. J., Briere, N. M., Tuson, K. M., & Blais, M. R. (1995). The Sport Motivation Scale: A measure of intrinsic, extrinsic, and amotivation in sport. Journal of Sport and Exercise Psychology, 17, 35-53.

Medicine: Coulton, C. J., Hyduk, C. M., & Chow, J. C. (1989). An assessment of the Arthritis Impact Measurement Scales in 3 ethnic groups. Journal of Rheumatology, 16, 1110-1115.
CHAPTER
5
Application 3: Testing the Factorial Validity of Scores From a Measuring Instrument (Second-Order CFA Model)

Whereas the two previous applications focused on CFA first-order models, the present application examines a CFA model that comprises a second-order factor. As such, we test hypotheses related to the Beck Depression Inventory (Beck, Ward, Mendelson, Mock, & Erbaugh, 1961) as it bears on the nonclinical adolescent population. The example is taken from a study by Byrne, Baron, and Campbell (1993) and represents one of a series of studies that have tested for the validity of second-order BDI factorial structure for high school adolescents in Canada (Byrne & Baron, 1993, 1994; Byrne, Baron, & Campbell, 1993, 1994), Sweden (Byrne, Baron, Larsson, & Melin, 1995, 1996), and Bulgaria (Byrne, Baron, & Balev, 1996, in press). Although the purposes of the Byrne et al. (1993) study were to cross-validate and test for an invariant BDI structure across gender for Canadian high school students, we focus only on factorial validity as it relates to the female calibration sample. (For further details regarding the sample, analyses, and results, readers are referred to the original article.) As noted in chapter 4, the CFA of a measuring instrument is most appropriately conducted with fully developed assessment measures that have demonstrated satisfactory factorial validity. Justification for CFA procedures in the present instance is based on evidence provided by Tanaka and Huba (1984), and on the aforementioned studies by Byrne and associates that BDI score data are most adequately represented by a hierarchical factorial
structure. That is to say, the first-order factors are explained by some higher order structure which, in the case of the BDI, is a single second-order factor of general depression. Let us turn now, then, to a description of the BDI, and its postulated structure.
THE MEASURING INSTRUMENT UNDER STUDY

The Beck Depression Inventory (BDI) is a 21-item scale that measures symptoms related to cognitive, behavioral, affective, and somatic components of depression. Although originally designed for use by trained interviewers, it is now most typically used as a self-report measure. For each item, respondents are presented with four statements rated from 0 to 3 in terms of intensity, and asked to select the one that most accurately describes their own feelings; higher scores represent a more severe level of reported depression. BDI item scores, therefore, represent ordinal scale measurement. Jöreskog and Sörbom (1993a, 1993b) have argued that when the observed variables in SEM analyses are either all of ordinal scale, or a combination of ordinal and interval scales, the categorical nature of these variables should be taken into account. In particular, they noted that analyses should not be based on Pearson product-moment correlations, which assume all variables to have a continuous scale; rather, analyses should be based on either polychoric or polyserial correlations (whichever is appropriate) using weighted least squares (WLS) estimation. Polychoric correlations represent relations between two ordinal variables; in the special case where both variables are dichotomous, they are termed tetrachoric correlations. Polyserial correlations represent the relation between an ordinal and a continuous variable; in the special case where the ordinal variable is dichotomous, the correct term is biserial correlation. In chapters 3 and 4, I discussed the known practical limitations associated with the approach to the SEM analysis of categorical variables advocated by Jöreskog and Sörbom (1993a, 1993b, 1993c), and explained why, historically, most researchers have treated the variables as if they were continuous using ML estimation procedures.
In addition, I summarized findings from several simulation studies that have investigated the robustness of test statistics when the categorical nature of the observed variables is ignored. This information is
therefore not repeated here, and readers are referred to these two previous chapters for a more detailed discussion of categorical variable analysis in SEM. For illustrative purposes, the application presented in this chapter addresses the issue of SEM based on categorical data. In particular, I describe how to formulate the polychoric (or polyserial) correlation matrix, and how to implement the correct estimation procedure. Before proceeding with these analyses, however, it is important that we turn first to a brief explanation of this SEM strategy.
Analysis of Categorical Data

General Theory

In addressing the categorical nature of observed variables, the researcher automatically assumes that each has an underlying continuous scale and, furthermore, that they are multivariate normally distributed (Bentler, 1995). As such, the categories can be regarded as only crude measurements of an unobserved variable that, in truth, has a continuous scale (Jöreskog & Sörbom, 1993a). Given these assumptions of underlying continuous scale and multivariate normality, SEM analyses of categorical variables automatically provide a threshold for each scale point. In other words, each threshold (or initial point) represents a portion of the continuous scale. For purposes of illustration, let us consider a measuring instrument in which each item is structured on a 3-point scale. Let z represent the ordinal variable (the item), and z* the unobservable continuous variable. The threshold values can then be conceptualized as:

If z* ≤ τ1, z is scored 1
If τ1 < z* ≤ τ2, z is scored 2
If τ2 < z*, z is scored 3

where τ1 < τ2 represent threshold values for z*. Because it is assumed that z* is normally distributed, these values are computed from the inverse of the standard normal distribution function (Jöreskog & Sörbom, 1993a).
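The threshold logic just described can be sketched in Python using the standard library's inverse normal distribution function. The function name `thresholds` and the 3-point example proportions are hypothetical, for illustration only; PRELIS performs the equivalent computation internally from the observed category frequencies.

```python
from statistics import NormalDist

def thresholds(category_proportions):
    """Estimate thresholds for an ordinal item under an assumed underlying
    standard-normal z*. Each threshold is the inverse normal of the
    cumulative proportion of cases at or below that category; m categories
    yield m - 1 thresholds."""
    cum = 0.0
    taus = []
    for p in category_proportions[:-1]:
        cum += p
        taus.append(NormalDist().inv_cdf(cum))
    return taus

# Hypothetical 3-point item: 40% scored 1, 35% scored 2, 25% scored 3
t1, t2 = thresholds([0.40, 0.35, 0.25])
```

Here τ1 is negative (fewer than half the cases fall in category 1) and τ2 ≈ 0.674, the normal quantile cutting off the top 25% of the z* distribution.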
The Correlation Matrices

As noted earlier, to conduct SEM analyses of categorical variables correctly, these analyses must be based on the polychoric (or polyserial) matrix.
Because these correlations are based on the underlying z* variables rather than the zs (the actual categorical scores), they are regarded as theoretical correlations. In PRELIS 2, these correlations are estimated from the observed pairwise contingency tables of the ordinal variables (Jöreskog & Sörbom, 1993a). Assuming a bivariate normal distribution for each pair of z* variables, the following types of correlation coefficients can be computed:

• Polychoric: both z variables have an ordinal scale.
• Tetrachoric: both z variables have a dichotomous scale.
• Polyserial: one z variable has an ordinal scale, the other has an interval scale.
• Biserial: one z variable has an interval scale, the other has a dichotomous scale.
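This taxonomy can be captured in a few lines of Python. The function name and scale labels below are invented for illustration; PRELIS selects the appropriate coefficient automatically once MA=PM is requested.

```python
def correlation_type(scale_x, scale_y):
    """Name the appropriate coefficient for a pair of variables, following
    the PRELIS taxonomy. Scales: 'dichotomous', 'ordinal', 'interval'.
    A dichotomous variable is treated as the two-category special case of
    an ordinal one."""
    categorical = {"ordinal", "dichotomous"}
    if scale_x in categorical and scale_y in categorical:
        return ("tetrachoric"
                if scale_x == scale_y == "dichotomous" else "polychoric")
    if "interval" in (scale_x, scale_y):
        other = scale_y if scale_x == "interval" else scale_x
        if other == "dichotomous":
            return "biserial"
        if other == "ordinal":
            return "polyserial"
    return "pearson"  # both variables on an interval (continuous) scale
```

For example, the 4-point BDI items paired with one another call for polychoric coefficients, which is exactly what the PRELIS run in this chapter produces.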
Estimation of Parameters

As mentioned earlier, the estimation of parameters for a model based on categorical data should be determined using WLS estimation procedures. The weight matrix in this case is represented by the inverse of the estimated asymptotic covariance matrix of the polychoric (or polyserial) correlation coefficients. In other words, the estimation of model parameters requires two matrices, rather than one: the polychoric (or polyserial) matrix, and the asymptotic covariance matrix. These matrices are readily obtained using PRELIS 2. We turn now to these preliminary analyses.
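The WLS discrepancy just described has the quadratic form F = (s − σ)′W⁻¹(s − σ), where s holds the sample polychoric correlations, σ the model-implied values, and W the asymptotic covariance matrix. A minimal sketch, with hypothetical toy values rather than PRELIS output:

```python
def wls_fit(s, sigma, w_inv):
    """WLS discrepancy F = (s - sigma)' W^-1 (s - sigma): residuals between
    sample and model-implied correlations, weighted by the inverse of their
    estimated asymptotic covariance matrix."""
    r = [si - gi for si, gi in zip(s, sigma)]
    return sum(r[i] * w_inv[i][j] * r[j]
               for i in range(len(r)) for j in range(len(r)))

# Hypothetical: two polychoric correlations and a diagonal inverse weight matrix
s = [0.45, 0.30]                        # sample values
sigma = [0.40, 0.32]                    # model-implied values
w_inv = [[50.0, 0.0], [0.0, 50.0]]      # illustrative inverse asymptotic covariances
F = wls_fit(s, sigma, w_inv)
```

Larger asymptotic variances (less precisely estimated correlations) shrink the corresponding entries of W⁻¹, so imprecise coefficients contribute less to the fit function.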
Preliminary Analyses Using PRELIS

The procedure for having PRELIS create and save the polychoric (or polyserial) and asymptotic covariance matrices follows the same pattern as was shown in chapter 4 (see Table 4.1), albeit the program is instructed to save both matrices. Because the BDI items have a 4-point scale, they represent ordinal measurement and therefore we need to have the polychoric matrix computed. Estimation of both the polychoric and asymptotic covariance matrices has been improved substantially from the previous PRELIS 1 version (for details, see Jöreskog & Sörbom, 1993c).1 The related PRELIS input file for the current BDI data is shown in Table 5.1.
1Jöreskog and Sörbom (1993c) further noted that differences in the asymptotic covariance matrix between the two programs have had a major impact on the computation of standard errors and values by LISREL 8 when a model is estimated using WLS.
Table 5.1
PRELIS Input: Creation of Polychoric and Asymptotic Covariance Matrices

! FORMULATION OF ASYMPTOTIC MATRIX FOR GIRLS "BDIGIRL.PRE"
! All Variables Treated as Ordinal
DA NI=21 NO=321
LA
BDI1 BDI2 BDI3 BDI4 BDI5 BDI6 BDI7 BDI8 BDI9 BDI10 BDI11 BDI12
BDI13 BDI14 BDI15 BDI16 BDI17 BDI18 BDI19 BDI20 BDI21
RA=BDIGIRL.DAT FO
(21F1.0)
OU MA=PM SM=BDIGIRL.PMM SA=BDIGIRL.ASM
In reviewing this file, we see that there are 21 observed variables and 321 cases; the input data, from which the program will compute the polychoric and asymptotic covariance matrices, is a raw data file labeled "BDIGIRL.DAT" with a fixed format statement which indicates that only the 21 BDI item scores are included in the file and each occupies one column only (21F1.0). The variable labels are consistent with the order in which the BDI items were entered into the data set. Finally, because no specification has been made regarding the scale of the variables, the program will automatically treat them as ordinal data (the default specification). Now let us take a close look at the OU line. The statement MA=PM requests that the polychoric matrix be computed. This specification remains the same for all categorical matrices (i.e., polychoric, polyserial, tetrachoric, biserial); the program automatically computes the correct matrix. The statement SM=BDIGIRL.PMM indicates that the polychoric matrix is to be saved in a file named BDIGIRL.PMM. Finally, the last statement (SA=BDIGIRL.ASM) indicates that the asymptotic covariance matrix is to be saved in a file called "BDIGIRL.ASM."2
2To save the asymptotic variances, the keyword is SV.
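For readers less familiar with FORTRAN format statements, the (21F1.0) specification simply means 21 numeric fields of one column each. A hypothetical Python sketch of reading such a record (the function name and sample record are invented; PRELIS does this parsing itself):

```python
def parse_fixed_f1(line, n_items=21):
    """Parse one record written under FORTRAN format (21F1.0): 21 item
    scores, each occupying exactly one column of the raw data line."""
    return [int(line[i]) for i in range(n_items)]

# One hypothetical record of 21 one-digit BDI item scores
record = "123412341234123412341"
scores = parse_fixed_f1(record)
```

A format such as (3F3.1,F5.2) would instead carve each record into three 3-column fields with one implied decimal followed by a 5-column field with two, which is the pattern used for the MTMM data in chapter 6.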
Table 5.2
Selected PRELIS Output: Ordinal Variables

TOTAL SAMPLE SIZE = 321

THRESHOLDS FOR ORDINAL VARIABLES
BDI1    0.249   1.339   1.782
BDI2    0.414   1.168   2.018
BDI3    0.573   1.249   2.082
BDI4   -0.059   1.025   1.441
BDI5    0.509   1.710   2.155
BDI6    0.330   0.889   1.215
BDI7    0.066   1.398   1.782
BDI8   -0.208   1.079   1.617
BDI9    0.161   1.745   2.243
BDI10   0.113   0.811   1.025
BDI11  -0.439   0.913   1.168
BDI12   0.717   1.821
BDI13   0.004   0.536   1.910
BDI14   0.161   0.600   1.199
BDI15  -0.066   1.038   2.352
BDI16   0.059   1.152   1.617
BDI17  -0.371   1.266   1.782
BDI18   0.216   1.012   1.487
BDI19   1.066   1.677   1.961
BDI20   0.405   1.266   2.018
BDI21   1.122   1.617   2.082

UNIVARIATE DISTRIBUTIONS FOR ORDINAL VARIABLES
BDI1   CATEGORY   FREQUENCY   PERCENTAGE
           1         192         59.8
           2         100         31.2
           3          17          5.3
           4          12          3.7
BDI2   CATEGORY   FREQUENCY   PERCENTAGE
           1         212         66.0
           2          70         21.8
           3          32         10.0
           4           7          2.2

BIVARIATE DISTRIBUTIONS FOR ORDINAL VARIABLES (FREQUENCIES)

              BDI2
BDI1      1     2     3     4
  1     141    38    11     2
  2      59    27    12     2
  3       7     4     5     1
  4       5     1     4     2

              BDI3
BDI1      1     2     3     4
  1     152    31     8     1
  2      67    22    10     1
  3       8     3     5     1
  4       3     1     5     3

              BDI4
BDI1      1     2     3     4
  1     110    65     5    12
  2      38    44    12     6
  3       3     6     5     3
  4       2     4     3     3

BIVARIATE DISTRIBUTIONS FOR ORDINAL VARIABLES (PERCENTAGES)

              BDI2
BDI1      1     2     3     4
  1    43.9  11.8   3.4   0.6
  2    18.4   8.4   3.7   0.6
  3     2.2   1.2   1.6   0.3
  4     1.6   0.3   1.2   0.6

              BDI3
BDI1      1     2     3     4
  1    47.4   9.7   2.5   0.3
  2    20.9   6.9   3.1   0.3
  3     2.5   0.9   1.6   0.3
  4     0.9   0.3   1.6   0.9

              BDI4
BDI1      1     2     3     4
  1    34.3  20.2   1.6   3.7
  2    11.8  13.7   3.7   1.9
  3     0.9   1.9   1.6   0.9
  4     0.6   1.2   0.9   0.9

CORRELATIONS AND TEST STATISTICS (PE=PEARSON PRODUCT MOMENT, PC=POLYCHORIC, PS=POLYSERIAL)

                               TEST OF MODEL            TEST OF CLOSE FIT
               CORRELATION  CHI-SQU.  D.F.  P-VALUE     RMSEA   P-VALUE
BDI2 vs. BDI1  0.359 (PC)     6.146    8     0.631      0.000    0.929
BDI3 vs. BDI1  0.447 (PC)    10.689    8     0.220      0.032    0.687
BDI3 vs. BDI2  0.486 (PC)     8.692    8     0.369      0.016    0.810
BDI4 vs. BDI1  0.382 (PC)    11.924    8     0.155      0.039    0.605
BDI4 vs. BDI2  0.285 (PC)     8.659    8     0.372      0.016    0.812
BDI4 vs. BDI3  0.322 (PC)    17.730    8     0.023      0.062    0.270

CORRELATION MATRIX

         BDI1    BDI2    BDI3    BDI4    BDI5    BDI6
BDI1    1.000
BDI2    0.359   1.000
BDI3    0.447   0.486   1.000
BDI4    0.382   0.285   0.322   1.000
BDI5    0.335   0.347   0.412   0.283   1.000
BDI6    0.430   0.385   0.436   0.314   0.468   1.000
numerical and graphical summaries of the univariate distributions for each item. Although only two items are described here, most were consistent in demonstrating positive skewness, with most respondents reporting no evidence of depressive symptoms.3 It is important to note that although the latent continuous z* distribution is assumed to be normally distributed, this is not so for the ordinal variables; they can be both skewed and kurtotic, as is the case here for the BDI items (Bollen, 1989b). Two sets of information related to the bivariate distributions are presented next in the output. Reported first for each pair of ordinal variables are the

3Given that the data comprise BDI scores for nonclinical adolescents, these results are not unexpected.
frequencies, followed by their conversion into percentages. For example, in examining the frequency bivariate distribution for BDI1 and BDI2 shown in Table 5.2, we see that of the 321 adolescents in the sample, 141 indicated that they did not feel particularly sad or discouraged (Item 1; Category 1); at the opposite end of the scale, two adolescents reported feeling very sad and discouraged most of the time (Item 2; Category 4). Shown next in Table 5.2 are test statistics related to the polychoric correlations, followed by the actual correlation matrix. These statistics derive from testing the hypothesis that the latent bivariate distribution underlying each coefficient (the z*s) is normal. Although only a subset of information from these results is shown in Table 5.2, their tabulation indicated that for 13 of the total 210 bivariate distributions, the hypothesis was rejected at the .01 probability level; another 21 could be rejected at the .05 level. As can be seen in Table 5.2, for example, this was the case for the pairing of BDI4 and BDI3. The presentation of the polychoric correlation matrix completes the information provided in the PRELIS output. Now that we have structured the necessary matrices required in the fitting of a model to categorical data, let us turn to the hypothesized model and to the LISREL analyses used in testing this model.
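The frequency-to-percentage conversion just described is simply each cell frequency divided by the total sample size. A short sketch using the BDI1 × BDI2 frequencies from Table 5.2 (the function name `to_percentages` is invented for illustration):

```python
def to_percentages(freq_table, n):
    """Convert a bivariate frequency table to percentages of the total
    sample size, as PRELIS reports for each pair of ordinal variables."""
    return [[round(100.0 * f / n, 1) for f in row] for row in freq_table]

# Frequencies for BDI1 (rows) by BDI2 (columns) from Table 5.2, n = 321
bdi1_by_bdi2 = [[141, 38, 11, 2],
                [59, 27, 12, 2],
                [7, 4, 5, 1],
                [5, 1, 4, 2]]
pct = to_percentages(bdi1_by_bdi2, 321)
```

The leading cell reproduces the 43.9% reported in the output (141 of 321 cases), and the cell frequencies sum to the full sample of 321.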
THE HYPOTHESIZED MODEL

The model to be tested in the present application derives from the work of Byrne et al. (1993). As such, the CFA model hypothesized a priori that: (a) responses to the BDI could be explained by three first-order factors (Negative Attitude, Performance Difficulty, Somatic Elements), and one second-order factor (General Depression), (b) each item would have a nonzero loading on the first-order factor it was designed to measure, and zero loadings on the other two first-order factors, (c) error terms associated with each item would be uncorrelated, and (d) covariation among the three first-order factors would be explained fully by their regression on the second-order factor. A diagrammatic representation of this model is presented in Fig. 5.1.
4To resolve problems of nonpositive definiteness resulting from a disproportionately high number of zero entries in the data, all scores were monotonically increased by one (see also Tanaka & Huba, 1984).
[FIG. 5.1. Hypothesized second-order model of BDI structure; the surviving labels show the Negative Attitude first-order factor with items including Sadness, Pessimism, Failure, Guilt, Punishment, and Self-Dislike.]

[FIG. 6.1. Hypothesized MTMM Model (Model 1) with LISREL notation.]
Chapter 6
Table 6.1
PRELIS Input: Data Screening

! Testing MTMM models "MTMM7.PRE"
! Grade 7 Students
! Data Screening
DA NI=16 NO=193
LA
SCSELF SCTCH SCPAR SCPEER ACSELF ACTCH ACPAR ACPEER
ECSELF ECTCH ECPAR ECPEER MCSELF MCTCH MCPAR MCPEER
RA=IND7MT.DAT FO
(3F3.1,F5.2,3F3.1,F5.2,3F3.1,F5.2,3F3.1,F5.2)
CO ALL
OU
the LISREL input related to Fig. 6.1, let us turn our attention first to the PRELIS file used for data-screening purposes. The related input, as shown in Table 6.1, indicates that (a) the data comprise a set of 16 scores based on 193 cases; (b) these data are in the form of a raw matrix, the fixed format of which follows on the next line; and (c) all variables are of a continuous scale (CO ALL). Turning now to the output file, partially shown in Table 6.2, we see that the descriptive information is very comprehensive. In addition to being provided with the mean, standard deviation, skewness, and kurtosis coefficients, you are also shown how many cases fell at both the low and high ends of the scale and are provided with information related to the statistical significance of the skewness and kurtosis coefficients. As a pictorial summary of distributional information, a histogram is produced for each variable. Considering that a normal distribution of scores is characterized by skewness and kurtosis values approximating zero, a review of the summary statistics shown in Table 6.2 reveals ratings by parents (e.g., ACPAR) to be the primary contributor to the elevated skewness values, and ratings by peers (e.g., ACPEER) to be the major contributor with respect to kurtosis. Nonetheless, from a practical perspective, it seems evident that normality must necessarily be defined within a range that spans either side of zero. Although no definite cutpoints have been firmly advanced as to when scores may no longer be regarded as normally distributed (Curran et al., 1996), Monte Carlo simulation research has been helpful in suggesting thresholds for the categorization of distributions as normal, moderately nonnormal, and extremely nonnormal. Accordingly, Curran et al., following the pattern used in other Monte Carlo studies, considered scores to be moderately nonnormal if they demonstrated skewness values ranging from 2.00 to 3.00,
Table 6.2
PRELIS Output: Descriptive Statistics

TOTAL SAMPLE SIZE = 193

UNIVARIATE SUMMARY STATISTICS FOR CONTINUOUS VARIABLES
VARIABLE    MEAN  ST. DEV.  SKEWNESS  KURTOSIS  MINIMUM  FREQ.  MAXIMUM  FREQ.
SCSELF     3.032    0.708    -0.552    -0.463     1.000     1     4.000
SCTCH      2.954    0.686    -0.299    -0.162     1.000     3     4.000    31
SCPAR      3.301    0.809    -1.035     0.155     1.000     3     4.000    85
SCPEER    -0.126    1.117    -0.240     5.394    -5.500     2     3.310     1
ACSELF     3.005    0.575    -0.196    -0.531     1.500     1     4.000    11
ACTCH      3.249    0.734    -0.843     0.080     1.000     2     4.000    66
ACPAR      3.571    0.641    -1.395     0.849     1.500     2     4.000   116
ACPEER     0.081    1.036     1.861     2.784    -1.220     1     4.080     1
ECSELF     3.119    0.567    -0.493    -0.447     1.500     2     4.000     8
ECTCH      3.148    0.778    -0.395    -0.884     1.000     2     4.000    68
ECPAR      3.412    0.696    -0.922    -0.157     1.500     5     4.000    94
ECPEER     0.041    1.050     1.764     3.057    -1.000     6     4.730     1
MCSELF     2.936    0.821    -0.417    -0.714     1.000     4     4.000    30
MCTCH      3.077    0.856    -0.551    -0.552     1.000     8     4.000    66
MCPAR      3.306    0.881    -1.235     0.446     1.000     9     4.000    88
MCPEER     0.051    1.024     1.877     2.878    -1.130     2     3.840     1

TEST OF UNIVARIATE NORMALITY FOR CONTINUOUS VARIABLES
             SKEWNESS            KURTOSIS           SKEWNESS AND KURTOSIS
          Z-SCORE  P-VALUE    Z-SCORE  P-VALUE      CHI-SQUARE  P-VALUE
SCSELF    -2.433    0.007     -1.484    0.069          8.125     0.017
SCTCH     -1.822    0.034     -0.270    0.394          3.393     0.183
SCPAR     -3.073    0.001      0.669    0.252          9.894     0.007
SCPEER    -1.613    0.053      5.524    0.000         33.115     0.000
ACSELF    -1.426    0.077     -1.828    0.034          5.376     0.068
ACTCH     -2.863    0.002      0.471    0.319          8.418     0.015
ACPAR     -3.379    0.000      2.081    0.019         15.746     0.000
ACPEER     3.675    0.000      4.152    0.000         30.744     0.000
ECSELF    -2.319    0.010     -1.408    0.080          7.360     0.025
ECTCH     -2.096    0.018     -4.305    0.000         22.930     0.000
ECPAR     -2.954    0.002     -0.252    0.400          8.792     0.012
ECPEER     3.620    0.000      4.343    0.000         31.962     0.000
MCSELF    -2.152    0.016     -2.929    0.002         13.211     0.001
MCTCH     -2.431    0.008     -1.938    0.026          9.669     0.008
MCPAR     -3.254    0.001      1.343    0.090         12.391     0.002
MCPEER     3.684    0.000      4.219    0.000         31.373     0.000

HISTOGRAMS FOR CONTINUOUS VARIABLES
SCSELF   FREQUENCY   PERCENTAGE   LOWER CLASS LIMIT
              4          2.1           1.000
              1          0.5           1.300
              9          4.7           1.600
             17          8.8           1.900
             19          9.8           2.200
             25         13.0           2.500
             15          7.8           2.800
             38         19.7           3.100
             29         15.0           3.400
             36         18.7           3.700
Table 6.2 cont.
SCTCH    FREQUENCY   PERCENTAGE   LOWER CLASS LIMIT
              3          1.6           1.000
              3          1.6           1.300
              0          0.0           1.600
             26         13.5           1.900
             29         15.0           2.200
              0          0.0           2.500
             76         39.4           2.800
              0          0.0           3.100
             25         13.0           3.400
             31         16.1           3.700

SCPAR    FREQUENCY   PERCENTAGE   LOWER CLASS LIMIT
              3          1.6           1.000
             11          5.7           1.300
              0          0.0           1.600
             11          5.7           1.900
              0          0.0           2.200
             14          7.3           2.500
             42         21.8           2.800
              0          0.0           3.100
             27         14.0           3.400
             85         44.0           3.700

SCPEER   FREQUENCY   PERCENTAGE   LOWER CLASS LIMIT
              2          1.0          -5.500
              0          0.0          -4.619
              0          0.0          -3.738
              0          0.0          -2.857
             16          8.3          -1.976
             89         46.1          -1.095
             50         25.9          -0.214
             20         10.4           0.667
             11          5.7           1.548
              5          2.6           2.429
and kurtosis values from 7.00 to 21.00; extreme nonnormality was defined by skewness values > 3.00, and kurtosis values > 21.00. Using these categories as guidelines, and with mean skewness and kurtosis values of -.19 and .73, respectively, for all practical purposes, we can likely consider the scores shown in Table 6.2 as generally approximating a normal distribution. Only histograms related to the first four variables are presented in Table 6.2. However, it is interesting to compare the distribution of scores for social competence as perceived through the eyes of the students themselves (SCSELF), their teachers (SCTCH), their parents (SCPAR), and their peers (SCPEER). Whereas self- and teacher rating scores of social competence are relatively normally distributed, those for parents tend to be negatively skewed with minimal kurtosis and those for peers are minimally skewed with moderately high negative kurtosis. Taken together, this information reveals the extent to which ratings of competence, as perceived by different individuals, can deviate. For a more detailed look at ratings by self, teacher, parent, and peer, readers are referred to the published article (Byrne & Bazana, 1996). Having reviewed these summary statistics related to the data, let us turn our attention now to the CFA modeling of MTMM data using LISREL 8.
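The Curran et al. guidelines and the mean skewness and kurtosis values quoted above can be checked with a short Python sketch. The function name is invented, and treating the skewness bands as absolute values is an assumption made here for illustration; the two coefficient lists are taken directly from Table 6.2.

```python
def nonnormality_category(skew, kurtosis):
    """Classify a univariate distribution using the guidelines cited in the
    text: moderately nonnormal for skewness of 2.00-3.00 or kurtosis of
    7.00-21.00; extremely nonnormal for skewness > 3.00 or kurtosis > 21.00."""
    s, k = abs(skew), abs(kurtosis)
    if s > 3.00 or k > 21.00:
        return "extremely nonnormal"
    if s >= 2.00 or k >= 7.00:
        return "moderately nonnormal"
    return "approximately normal"

# Skewness and kurtosis coefficients for the 16 variables in Table 6.2
skews = [-0.552, -0.299, -1.035, -0.240, -0.196, -0.843, -1.395, 1.861,
         -0.493, -0.395, -0.922, 1.764, -0.417, -0.551, -1.235, 1.877]
kurts = [-0.463, -0.162, 0.155, 5.394, -0.531, 0.080, 0.849, 2.784,
         -0.447, -0.884, -0.157, 3.057, -0.714, -0.552, 0.446, 2.878]

mean_skew = round(sum(skews) / len(skews), 2)  # reproduces the -.19 in the text
mean_kurt = round(sum(kurts) / len(kurts), 2)  # reproduces the .73 in the text
worst = nonnormality_category(1.877, 5.394)    # most extreme tabled values
```

Even the most extreme variables (MCPEER's skewness of 1.877; SCPEER's kurtosis of 5.394) fall below the moderate-nonnormality thresholds, consistent with the conclusion drawn in the text.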
THE CFA APPROACH TO MTMM ANALYSES

In testing for evidence of construct validity within the framework of the general CFA model, it has become customary to follow guidelines set forth by Widaman (1985). As such, the hypothesized MTMM model is compared with a nested series of more restrictive models in which specific parameters are either eliminated, or constrained equal to zero or 1.0. The difference in χ2 (Δχ2) between nested models, as discussed in earlier chapters, provides the yardstick by which to judge evidence of convergent and discriminant validity. Although these evaluative comparisons are made solely at the matrix level, the CFA format allows for an assessment of construct validity at the individual parameter level; the postulated model provides the basis for this assessment. In testing for evidence of construct validity using the CFA approach, assessment is typically formulated at both the matrix and the individual parameter levels. We examine both in the present application.

The MTMM model portrayed in Fig. 6.1 represents the hypothesized model and serves as the baseline against which all other alternatively nested models are compared in the process of assessing evidence of construct validity. Clearly, this CFA model represents a much more complex structure than either of the two comparable models studied earlier (chapters 3 and 4). This complexity arises primarily from the loading of each observed variable onto both a trait and a method factor. In addition, the model postulates that, although the traits are correlated among themselves, as are the methods, any correlations between traits and methods are assumed to be nil. For our purposes here, we proceed first to test for evidence of construct validity at the matrix level, and then move on to an examination of the individual parameters. The four nested models described and tested in the present application represent those most commonly included in CFA MTMM analyses.
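The nested-model comparison reduces to a simple subtraction of chi-square values and degrees of freedom. A minimal sketch (the function name and the two models' fit values are hypothetical, for illustration only):

```python
def chi_square_difference(chi2_nested, df_nested, chi2_full, df_full):
    """Chi-square difference test for two nested models: subtract the less
    restrictive (full) model's chi-square and df from the more restrictive
    (nested) model's. A large delta-chi-square relative to delta-df signals
    that the restrictions significantly worsen fit."""
    d_chi2 = chi2_nested - chi2_full
    d_df = df_nested - df_full
    return d_chi2, d_df

# Hypothetical values: restrictive model chi2 = 310.4 (df = 104)
# versus the baseline MTMM model chi2 = 158.2 (df = 77)
d_chi2, d_df = chi_square_difference(310.4, 104, 158.2, 77)
```

The resulting Δχ2 would then be referred to the χ2 distribution with Δdf degrees of freedom to judge statistical significance.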
THE GENERAL CFA MTMM MODEL

Model 1: Correlated Traits/Correlated Methods

The first model to be tested (Model 1) represents the hypothesized model shown in Fig. 6.1 and serves as the baseline against which all alternative MTMM models are compared. As noted earlier, because its specification includes both trait and method factors and allows for correlations among traits and among methods, this model is typically the least restrictive. The related LISREL input file is presented in Table 6.3.
LISREL Input File

Because the initial file in which no start values were specified took 79 iterations before the solution converged, the input file was subsequently modified to include values based on this first set of estimates. Given that these start values were close to the true ML estimates, their inclusion reduced the number of iterations from 79 to 27. Although the input file for Model 1 (Table 6.3) follows the same general pattern as those for the CFA models in chapters 3 and 4, there are three differential aspects that are worthy of note. First, on the MO line, you will see that, in contrast to the previous applications, the factor variance-covariance matrix is specified as symmetric and fixed (PH=SY,FI), and the error variance-covariance matrix as diagonal and free (TD=DI,FR). The rationale underlying this specification for the Phi matrix is because most elements in the matrix must be constrained either to 0.0, or to 1.00.3 The Theta Delta specification, on the other hand, is tied to the complexity of the MTMM model in that the additional computation of modification indices can often lead to nonconvergent solutions, improper solutions, or both. Thus, to allow for the generation of modification indices unnecessarily complicates an already complex CFA model. Second, unlike our previous examples, all factor loadings are specified as freely estimated. As a consequence, in order for the model to be identified statistically, all factor variances must be constrained (see chapter 1); as indicated by the VA keyword, in the present example they have been fixed at a value of 1.00. Third, note that the only parameters to be freely estimated
3Of course, one could also specify PH=SY,FR, in which case the fixed parameters would need to be specified as such.
Application 4
Table 6.3
LISREL Input for MTMM Model 1 (Correlated Traits/Correlated Methods)

! Test of MTMM Model for Grade 7 "MT7CTCM1"
! Start Values Added
! Correlated Traits and Correlated Methods
DA NI=16 NO=193 MA=CM
LA
SCSELF SCTCH SCPAR SCPEER ACSELF ACTCH ACPAR ACPEER
ECSELF ECTCH ECPAR ECPEER MCSELF MCTCH MCPAR MCPEER
RA=IND7MT.DAT FO
(3F3.1,F5.2,3F3.1,F5.2,3F3.1,F5.2,3F3.1,F5.2)
SELECTION
SCSELF ACSELF ECSELF MCSELF SCTCH ACTCH ECTCH MCTCH
SCPAR ACPAR ECPAR MCPAR SCPEER ACPEER ECPEER MCPEER
MO NX=16 NK=8 LX=FU,FI PH=SY,FI TD=DI,FR
LK
SOCCOMP ACADCOMP ENGCOMP MATHCOMP SELFRATG TCHRATG PARRATG PEERRATG
FR LX(1,1) LX(5,1) LX(9,1) LX(13,1)
FR LX(2,2) LX(6,2) LX(10,2) LX(14,2)
FR LX(3,3) LX(7,3) LX(11,3) LX(15,3)
FR LX(4,4) LX(8,4) LX(12,4) LX(16,4)
FR LX(1,5) LX(2,5) LX(3,5) LX(4,5)
FR LX(5,6) LX(6,6) LX(7,6) LX(8,6)
FR LX(9,7) LX(10,7) LX(11,7) LX(12,7)
FR LX(13,8) LX(14,8) LX(15,8) LX(16,8)
FR PH(2,1) PH(3,1) PH(4,1) PH(3,2) PH(4,2) PH(4,3)
FR PH(6,5) PH(7,5) PH(8,5) PH(7,6) PH(8,6) PH(8,7)
VA 1.00 PH(1,1) PH(2,2) PH(3,3) PH(4,4) PH(5,5) PH(6,6) PH(7,7)
VA 1.00 PH(8,8)
ST .9 LX(14,8)
ST .5 LX(1,1) LX(9,1) LX(3,3) LX(11,3) LX(4,4) LX(8,4) LX(12,4)
ST .5 LX(2,5) LX(6,6) LX(7,6) LX(8,6) LX(12,7) LX(13,8) LX(10,7)
ST .5 LX(15,8) LX(16,8)
ST .3 LX(5,1) LX(13,1) LX(10,2) LX(15,3) LX(16,4) LX(4,5) LX(9,7)
ST .3 LX(11,7)
ST .1 LX(2,2) LX(6,2) LX(14,2) LX(7,3) LX(1,5) LX(3,5) LX(5,6)
ST .8 PH(3,2) PH(4,2)
ST .5 PH(7,5) PH(7,6) PH(8,6)
ST .3 PH(2,1) PH(4,3) PH(6,5) PH(8,5)
ST .1 PH(3,1) PH(4,1) PH(8,7)
ST .8 TD(13,13)
ST .5 TD(15,15) TD(16,16)
ST .3 TD(5,5) TD(8,8) TD(9,9)
ST .1 TD(1,1) TD(3,3) TD(4,4) TD(7,7) TD(10,10) TD(11,11) TD(12,12)
ST .1 TD(14,14)
ST .0 TD(2,2)
ST .06 TD(6,6)
OU NS AD=OFF
in the Phi matrix are correlations among the first four factors (the traits) and among the last four factors (the methods); all other parameters remain constrained either to zero by default, or to 1.00 in the case of the variances.4 Finally, on the OU line, you will see AD=OFF. This command requests that the admissibility check be turned off. As noted in chapter 5, because diagonal elements of symmetric covariance matrices must always be positive

4As a consequence of problems related to both the identification and estimation of CFA models, trait-method correlations cannot be freely estimated (see Schmitt & Stults, 1986; Widaman, 1985).
values (Wothke, 1993), the constrained variances in the Phi matrix will render the matrix nonpositive-definite. This condition represents an inadmissible solution and, thus, the program terminates all output subsequent to the goodness-of-fit statistics. To resolve this problem when model specification includes fixed diagonal elements by intent, Jöreskog and Sörbom (1996a) advised that the admissibility check should be turned off.
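As a cross-check on a specification such as the FR LX lines of Table 6.3, the trait-and-method loading pattern can be generated programmatically. The Python sketch below is an illustration written for this chapter, not LISREL syntax; the variable ordering follows the SELECTION line (self, teacher, parent, peer rating blocks, each containing the SC, AC, EC, MC measures).

```python
traits = ["SOCCOMP", "ACADCOMP", "ENGCOMP", "MATHCOMP"]
methods = ["SELFRATG", "TCHRATG", "PARRATG", "PEERRATG"]

def mtmm_pattern():
    """Free-loading pattern for the 16 x 8 Lambda-X matrix: the variable in
    position (method m, trait t) loads on trait t and on method m, matching
    the FR LX lines of the input file. 1 marks a freely estimated loading."""
    rows = []
    for m in range(len(methods)):       # self, teacher, parent, peer blocks
        for t in range(len(traits)):    # SC, AC, EC, MC within each block
            row = [0] * 8
            row[t] = 1                  # trait loading
            row[4 + m] = 1              # method loading
            rows.append(row)
    return rows

pattern = mtmm_pattern()
free_loadings = sum(sum(r) for r in pattern)  # 32 free loadings, as in Table 6.4
```

Each observed variable thus loads on exactly one trait factor and one method factor, which is the defining feature of the correlated traits/correlated methods model.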
LISREL Output File

From personal experience, I have always found the parameter specifications, as presented in the LISREL output, to be extremely helpful in tracking particular estimated parameters. Thus, this information is presented first, in Table 6.4. This section of the output can also be of substantial value in helping you to visually capture the specification of a model, and is particularly so with respect to MTMM models. For this reason, then, the parameter specification summary for each MTMM model presented throughout this chapter is included in the tabled LISREL output. Before proceeding to the estimate results presented in Table 6.5, however, I strongly urge you to take a few minutes to study the linkage between the model depicted in Fig. 6.1 and the parameter specification presented in Table 6.4. For example, in reviewing the Lambda matrix, you will see that the pattern of factor loadings necessarily differs for traits versus methods; and, in the Phi matrix, it is easy to see the two clusters of estimated parameters representing correlations among traits and among methods. For reasons linked to an improper solution, the estimates presented in Table 6.5 relate to only the Phi and Theta Delta matrices. Of particular import is the negative error variance for the variable "ACSELF" (-0.12) that, as discussed in chapter 5, is indicative of a boundary parameter. It is now widely known that the estimation of negative variances is a common occurrence with applications of the general CFA model to MTMM data; indeed, so pervasive is this problem that the estimation of a proper solution may be regarded as a rare find (see, e.g., Kenny & Kashy, 1992; Marsh, 1989; Wothke, 1987). Although these results can be triggered by a number of factors, one likely cause in the case of MTMM models is the overparameterization of the model (see Wothke, 1993); this condition likely occurs as a function of the complexity of specification.
One fruitful approach in addressing this conundrum is to reparameterize the model in the format of the Correlated Uniqueness model (see Kenny, 1976, 1979; Kenny & Kashy, 1992;
Table 6.4 LISREL Output: Parameter Specifications for MTMM Model 1

PARAMETER SPECIFICATIONS

LAMBDA-X (trait factors)
         SOCCOMP  ACADCOMP  ENGCOMP  MATHCOMP
SCSELF      1        0         0        0
ACSELF      0        3         0        0
ECSELF      0        0         5        0
MCSELF      0        0         0        7
SCTCH       9        0         0        0
ACTCH       0       11         0        0
ECTCH       0        0        13        0
MCTCH       0        0         0       15
SCPAR      17        0         0        0
ACPAR       0       19         0        0
ECPAR       0        0        21        0
MCPAR       0        0         0       23
SCPEER     25        0         0        0
ACPEER      0       27         0        0
ECPEER      0        0        29        0
MCPEER      0        0         0       31

LAMBDA-X (method factors)
         SELFRATG  TCHRATG  PARRATG  PEERRATG
SCSELF      2         0        0        0
ACSELF      4         0        0        0
ECSELF      6         0        0        0
MCSELF      8         0        0        0
SCTCH       0        10        0        0
ACTCH       0        12        0        0
ECTCH       0        14        0        0
MCTCH       0        16        0        0
SCPAR       0         0       18        0
ACPAR       0         0       20        0
ECPAR       0         0       22        0
MCPAR       0         0       24        0
SCPEER      0         0        0       26
ACPEER      0         0        0       28
ECPEER      0         0        0       30
MCPEER      0         0        0       32

PHI
          SOCCOMP  ACADCOMP  ENGCOMP  MATHCOMP  SELFRATG  TCHRATG  PARRATG  PEERRATG
SOCCOMP      0
ACADCOMP    33         0
ENGCOMP     34        35        0
MATHCOMP    36        37       38         0
SELFRATG     0         0        0         0        0
TCHRATG      0         0        0         0       39        0
PARRATG      0         0        0         0       40       41        0
PEERRATG     0         0        0         0       42       43       44        0

THETA-DELTA
SCSELF  ACSELF  ECSELF  MCSELF  SCTCH  ACTCH  ECTCH  MCTCH
  45      46      47      48     49     50     51     52
SCPAR   ACPAR   ECPAR   MCPAR  SCPEER ACPEER ECPEER MCPEER
  53      54      55      56     57     58     59     60
Marsh, 1989; Marsh & Bailey, 1991; Marsh & Grayson, 1995). However, because (a) the Correlated Uniqueness model represents a special case of, rather than a nested model within, the general CFA framework, and (b) I consider it important to work through the nested model comparisons proposed by Widaman (1985), I delay discussion and application of this model until later in the chapter. In an attempt to resolve the presence of the negative error variance for the ACSELF variable, I respecified Model 1 such that the error variance of this variable was constrained equal to that of ACTCH, which also had a low estimated value. My decision to proceed in this manner was also influenced by the very high correlations between the academic competence factor and the two subject-specific competence factors (r = .89; r = .86). This respecified model yielded a proper solution, the results of which are reported in Table 6.6. As can be seen in Table 6.6, this reestimated model resulted in little change to the parameter estimates in the Theta Delta matrix, yet had the added bonus that all error variance estimates were within the range of acceptable values. (Because the factor loadings are summarized at the end of the chapter, they are not included here.) The fit statistics related to this model indicate an exceptionally good fit to the data. Interestingly, whereas the CFI suggests a perfect fit (CFI = 1.00), the GFI seems more realistic in its value of .95. Nonetheless, had additional parameters been added subsequent to post hoc analyses, I would have concluded that the results were indicative of an overfitted model. Because this was not the case, however, I can only presume that the model fits the data exceptionally well. Throughout the remainder of the chapter, then, this modified model (labeled Model 1A) served as the baseline model.
Table 6.5 LISREL Output: Parameter Estimates for MTMM Model 1
LISREL ESTIMATES (MAXIMUM LIKELIHOOD) (Number of Iterations = 27)
PHI (estimate, standard error, t-value)
          SOCCOMP            ACADCOMP           ENGCOMP            MATHCOMP
SOCCOMP   1.00
ACADCOMP  0.32 (0.18) 1.79   1.00
ENGCOMP   0.15 (0.12) 1.21   0.89 (0.18) 5.02   1.00
MATHCOMP  0.21 (0.11) 1.85   0.86 (0.17) 5.07   0.38 (0.13) 2.82   1.00

PHI (estimate, standard error, t-value)
          SELFRATG           TCHRATG            PARRATG            PEERRATG
SELFRATG  1.00
TCHRATG   0.33 (0.11) 3.02   1.00
PARRATG   0.44 (0.15) 2.87   0.57 (0.08) 6.82   1.00
PEERRATG  0.31 (0.10) 3.02   0.42 (0.07) 5.66   0.23 (0.12) 2.03   1.00

THETA-DELTA (estimate, standard error, t-value)
SCSELF    0.14 (0.06)  2.33
ACSELF   -0.12 (0.14) -0.92
ECSELF    0.10 (0.04)  2.29
MCSELF    0.22 (0.04)  5.10
SCTCH     0.35 (0.04)  8.99
ACTCH     0.06 (0.03)  2.23
ECTCH     0.20 (0.03)  6.43
MCTCH     0.30 (0.04)  7.88
SCPAR     0.34 (0.05)  6.33
ACPAR     0.14 (0.03)  5.41
ECPAR     0.19 (0.04)  4.94
MCPAR     0.17 (0.04)  3.92
SCPEER    0.92 (0.10)  9.14
ACPEER    0.10 (0.06)  1.79
ECPEER    0.55 (0.06)  8.41
MCPEER    0.44 (0.06)  7.72
Table 6.6 LISREL Output: Error Variance Estimates and Goodness-of-Fit Statistics for MTMM Model 1A
LISREL ESTIMATES (MAXIMUM LIKELIHOOD) (Number of Iterations = 29)
THETA-DELTA (estimate, standard error, t-value)
SCSELF    0.14 (0.06)  2.47
ACSELF    0.02 (0.02)  1.42
ECSELF    0.11 (0.03)  4.10
MCSELF    0.20 (0.04)  5.48
SCTCH     0.36 (0.04)  8.99
ACTCH     0.02 (0.02)  1.42
ECTCH     0.21 (0.03)  8.11
MCTCH     0.33 (0.04)  8.71
SCPAR     0.34 (0.05)  6.16
ACPAR     0.13 (0.02)  5.84
ECPAR     0.18 (0.03)  5.17
MCPAR     0.12 (0.05)  2.46
SCPEER    0.92 (0.10)  9.10
ACPEER    0.13 (0.06)  2.30
ECPEER    0.53 (0.06)  8.21
MCPEER    0.44 (0.06)  7.65
GOODNESS OF FIT STATISTICS
CHI-SQUARE WITH 77 DEGREES OF FREEDOM = 77.43 (P = 0.46)
ESTIMATED NON-CENTRALITY PARAMETER (NCP) = 0.43
90 PERCENT CONFIDENCE INTERVAL FOR NCP = (0.0 ; 25.27)
ROOT MEAN SQUARE ERROR OF APPROXIMATION (RMSEA) = 0.0054
90 PERCENT CONFIDENCE INTERVAL FOR RMSEA = (0.0 ; 0.041)
P-VALUE FOR TEST OF CLOSE FIT (RMSEA < 0.05) = 0.99
EXPECTED CROSS-VALIDATION INDEX (ECVI) = 1.02
90 PERCENT CONFIDENCE INTERVAL FOR ECVI = (1.02 ; 1.15)
STANDARDIZED RMR = 0.046
GOODNESS OF FIT INDEX (GFI) = 0.95
ADJUSTED GOODNESS OF FIT INDEX (AGFI) = 0.92
COMPARATIVE FIT INDEX (CFI) = 1.00
We turn now to the other MTMM models against which Model 1A will be compared. In the interest of space, only model specification input, parameter specification output, and goodness-of-fit statistics are tabled for each; Model 1A start values were used throughout, and no convergence problems were encountered.
Model 2: No Traits/Correlated Methods Specification of parameters for this model is presented in Table 6.7, and the related output, as noted earlier, is presented in Table 6.8. Of major importance with this input file is the absence of trait factors. Thus, whereas the method factors constituted Factors 5 to 8 for Model 1A, they have now been reparameterized as Factors 1 to 4. Relatedly, specification of factor correlations in the Phi matrix has been altered accordingly. A review of the parameter specification summary in Table 6.8 provides a clear picture of the pattern of estimated parameters in Model 2. (Because specification of the error variances remains unchanged across nested MTMM models, this portion of the output is not included.) As indicated by the following goodness-of-fit statistics, the fit for this model was extremely poor (e.g., χ2(99) = 375.25; GFI = .79; CFI = .80).
Table 6.7 LISREL Input for MTMM Model 2 (No Traits/Correlated Methods)
MO NX=16 NK=4 LX=FU,FI PH=SY,FI TD=DI,FR
LK
SELFRATG TCHRATG PARRATG PEERRATG
FR LX(1,1) LX(2,1) LX(3,1) LX(4,1)
FR LX(5,2) LX(6,2) LX(7,2) LX(8,2)
FR LX(9,3) LX(10,3) LX(11,3) LX(12,3)
FR LX(13,4) LX(14,4) LX(15,4) LX(16,4)
FR PH(2,1) PH(3,1) PH(4,1) PH(3,2) PH(4,2) PH(4,3)
VA 1.00 PH(1,1) PH(2,2) PH(3,3) PH(4,4)
ST .5 LX(1,1) LX(2,1) LX(3,1) LX(4,1)
ST .5 LX(5,2) LX(6,2) LX(7,2) LX(8,2)
ST .5 LX(9,3) LX(10,3) LX(11,3) LX(12,3)
ST .5 LX(13,4) LX(14,4) LX(15,4) LX(16,4)
ST .5 PH(2,1) PH(3,1) PH(4,1) PH(3,2) PH(4,2) PH(4,3)
ST .5 TD(1,1) TD(2,2) TD(3,3) TD(4,4) TD(5,5) TD(6,6) TD(7,7)
ST .5 TD(8,8)
OU NS AD=OFF
Model 3: Perfectly Correlated Traits/Freely Correlated Methods Input specifications and output results for Model 3 are illustrated in Tables 6.9 and 6.10, respectively. As with Model 1A, the hypothesized model, each observed variable loads on both a trait and a method factor. In contrast, however, correlations among the trait factors are fixed to 1.0, as indicated by the VA statement in the input file (see Table 6.9). Relatedly, in the parameter specifications presented in Table 6.10, you will note that no nonzero numbers are evident in the off-diagonal trait portion of the matrix
Table 6.8 LISREL Output: Parameter Specifications and Goodness-of-Fit Statistics for MTMM Model 2a

PARAMETER SPECIFICATIONS

LAMBDA-X
         SELFRATG  TCHRATG  PARRATG  PEERRATG
SCSELF      1         0        0        0
ACSELF      2         0        0        0
ECSELF      3         0        0        0
MCSELF      4         0        0        0
SCTCH       0         5        0        0
ACTCH       0         6        0        0
ECTCH       0         7        0        0
MCTCH       0         8        0        0
SCPAR       0         0        9        0
ACPAR       0         0       10        0
ECPAR       0         0       11        0
MCPAR       0         0       12        0
SCPEER      0         0        0       13
ACPEER      0         0        0       14
ECPEER      0         0        0       15
MCPEER      0         0        0       16

PHI
          SELFRATG  TCHRATG  PARRATG  PEERRATG
SELFRATG     0
TCHRATG     17         0
PARRATG     18        19        0
PEERRATG    20        21       22        0

GOODNESS OF FIT STATISTICS
CHI-SQUARE WITH 99 DEGREES OF FREEDOM = 375.25 (P = 0.0)
ESTIMATED NON-CENTRALITY PARAMETER (NCP) = 276.25
ROOT MEAN SQUARE ERROR OF APPROXIMATION (RMSEA) = 0.12
EXPECTED CROSS-VALIDATION INDEX (ECVI) = 2.34
STANDARDIZED RMR = 0.15
GOODNESS OF FIT INDEX (GFI) = 0.79
ADJUSTED GOODNESS OF FIT INDEX (AGFI) = 0.72
COMPARATIVE FIT INDEX (CFI) = 0.80
CONFIDENCE LIMITS COULD NOT BE COMPUTED DUE TO TOO SMALL P-VALUE FOR CHI-SQUARE

aBecause the Theta Delta parameters remain unchanged, they are not included here.
(i.e., involving Factors 1-4), thereby indicating that these parameters are not to be estimated. Viewing the goodness-of-fit results in Table 6.10, we see that the fit of this model (e.g., χ2(83) = 207.91; GFI = .88; CFI = .91) is marginally good, albeit substantially less well fitting than was the case for Model 1A. Furthermore, as was the case for Model 2, the program was unable to compute confidence intervals.
Table 6.9 LISREL Input for MTMM Model 3 (Perfectly Correlated Traits/Freely Correlated Methods)
MO NX=16 NK=8 LX=FU,FI PH=SY,FI TD=DI,FR
LK
SOCCOMP ACADCOMP ENGCOMP MATHCOMP SELFRATG TCHRATG PARRATG PEERRATG
FR LX(1,1) LX(5,1) LX(9,1) LX(13,1)
FR LX(2,2) LX(6,2) LX(10,2) LX(14,2)
FR LX(3,3) LX(7,3) LX(11,3) LX(15,3)
FR LX(4,4) LX(8,4) LX(12,4) LX(16,4)
FR LX(1,5) LX(2,5) LX(3,5) LX(4,5)
FR LX(5,6) LX(6,6) LX(7,6) LX(8,6)
FR LX(9,7) LX(10,7) LX(11,7) LX(12,7)
FR LX(13,8) LX(14,8) LX(15,8) LX(16,8)
FI PH(2,1) PH(3,1) PH(4,1) PH(3,2) PH(4,2) PH(4,3)
FR PH(6,5) PH(7,5) PH(8,5) PH(7,6) PH(8,6) PH(8,7)
VA 1.00 PH(1,1) PH(2,2) PH(3,3) PH(4,4) PH(5,5) PH(6,6) PH(7,7)
VA 1.00 PH(8,8)
VA 1.00 PH(2,1) PH(3,1) PH(4,1) PH(3,2) PH(4,2) PH(4,3)
ST .9 LX(14,8)
ST .5 LX(1,1) LX(9,1) LX(3,3) LX(11,3) LX(4,4) LX(8,4) LX(12,4)
ST .5 LX(2,5) LX(6,6) LX(7,6) LX(8,6) LX(10,7) LX(12,7) LX(13,8)
ST .5 LX(15,8) LX(16,8)
ST .3 LX(5,1) LX(13,1) LX(10,2) LX(15,3) LX(16,4) LX(4,5) LX(9,7)
ST .3 LX(10,7)
ST .1 LX(2,2) LX(6,2) LX(14,2) LX(7,3) LX(1,5) LX(3,5) LX(5,6)
ST .5 PH(7,5) PH(7,6) PH(8,6)
ST .3 PH(6,5) PH(8,5)
ST .1 PH(8,7)
ST .8 TD(13,13)
ST .5 TD(15,15) TD(16,16)
ST .3 TD(5,5) TD(8,8) TD(9,9)
ST .1 TD(1,1) TD(3,3) TD(4,4) TD(7,7) TD(10,10) TD(11,11) TD(12,12)
ST .1 TD(14,14)
ST .06 TD(6,6)
EQ TD(6,6) TD(2,2)
OU NS AD=OFF
Model 4: Freely Correlated Traits/Uncorrelated Methods As is evident from a review of the input file in Table 6.11, the only difference between this model and Model 1A is the absence of specified correlations among the method factors.5 Again, this specification is clearly visible in the parameter specification summary shown in Table 6.12, where no nonzero entries are evident for the off-diagonal elements involving the method factors. Goodness-of-fit results for this model reveal a well-fitting model (e.g., χ2(83) = 110.53; GFI = .93; CFI = .98). Nonetheless, Model 4 remains slightly less well fitting than Model 1A.
5Alternatively, we could have specified a model in which the method factor correlations were constrained to 1.00, indicating their perfect correlation. Although both specifications provide the same yardstick by which to determine discriminant validity, the interpretation of results must necessarily be altered accordingly.
Table 6.10 LISREL Output: Parameter Specifications and Goodness-of-Fit Statistics for MTMM Model 3a

PARAMETER SPECIFICATIONS

LAMBDA-X
[Identical in pattern to the Lambda-X specification shown in Table 6.4: trait loadings are free parameters 1, 3, 5, ..., 31 and method loadings are free parameters 2, 4, 6, ..., 32.]

PHI
          SOCCOMP  ACADCOMP  ENGCOMP  MATHCOMP  SELFRATG  TCHRATG  PARRATG  PEERRATG
SOCCOMP      0
ACADCOMP     0         0
ENGCOMP      0         0        0
MATHCOMP     0         0        0         0
SELFRATG     0         0        0         0        0
TCHRATG      0         0        0         0       33        0
PARRATG      0         0        0         0       34       35        0
PEERRATG     0         0        0         0       36       37       38        0

GOODNESS OF FIT STATISTICS
CHI-SQUARE WITH 83 DEGREES OF FREEDOM = 207.91 (P = 0.00)
ESTIMATED NON-CENTRALITY PARAMETER (NCP) = 124.91
ROOT MEAN SQUARE ERROR OF APPROXIMATION (RMSEA) = 0.089
EXPECTED CROSS-VALIDATION INDEX (ECVI) = 1.63
STANDARDIZED RMR = 0.071
GOODNESS OF FIT INDEX (GFI) = 0.88
ADJUSTED GOODNESS OF FIT INDEX (AGFI) = 0.80
COMPARATIVE FIT INDEX (CFI) = 0.91
CONFIDENCE LIMITS COULD NOT BE COMPUTED DUE TO TOO SMALL P-VALUE FOR CHI-SQUARE

aBecause the Theta Delta parameters remain unchanged, they are not included here.
Table 6.11 LISREL Input for MTMM Model 4 (Correlated Traits/Uncorrelated Methods)
MO NX=16 NK=8 LX=FU,FI PH=SY,FI TD=DI,FR
LK
SOCCOMP ACADCOMP ENGCOMP MATHCOMP SELFRATG TCHRATG PARRATG PEERRATG
FR LX(1,1) LX(5,1) LX(9,1) LX(13,1)
FR LX(2,2) LX(6,2) LX(10,2) LX(14,2)
FR LX(3,3) LX(7,3) LX(11,3) LX(15,3)
FR LX(4,4) LX(8,4) LX(12,4) LX(16,4)
FR LX(1,5) LX(2,5) LX(3,5) LX(4,5)
FR LX(5,6) LX(6,6) LX(7,6) LX(8,6)
FR LX(9,7) LX(10,7) LX(11,7) LX(12,7)
FR LX(13,8) LX(14,8) LX(15,8) LX(16,8)
FR PH(2,1) PH(3,1) PH(4,1) PH(3,2) PH(4,2) PH(4,3)
VA 1.00 PH(1,1) PH(2,2) PH(3,3) PH(4,4) PH(5,5) PH(6,6) PH(7,7)
VA 1.00 PH(8,8)
ST .9 LX(14,8)
ST .5 LX(1,1) LX(9,1) LX(3,3) LX(11,3) LX(4,4) LX(8,4) LX(12,4)
ST .5 LX(2,5) LX(6,6) LX(7,6) LX(8,6) LX(10,7) LX(12,7) LX(13,8)
ST .5 LX(15,8) LX(16,8)
ST .3 LX(5,1) LX(13,1) LX(10,2) LX(15,3) LX(16,4) LX(4,5) LX(9,7)
ST .3 LX(10,7)
ST .1 LX(2,2) LX(6,2) LX(14,2) LX(7,3) LX(1,5) LX(3,5) LX(5,6)
ST .8 PH(3,2) PH(4,2)
ST .3 PH(2,1) PH(4,3)
ST .1 PH(3,1) PH(4,1)
ST .8 TD(13,13)
ST .5 TD(15,15) TD(16,16)
ST .3 TD(5,5) TD(8,8) TD(9,9)
ST .1 TD(1,1) TD(3,3) TD(4,4) TD(7,7) TD(10,10) TD(11,11) TD(12,12)
ST .1 TD(14,14)
ST .06 TD(6,6)
EQ TD(6,6) TD(2,2)
OU NS AD=OFF
Table 6.12 LISREL Output: Parameter Specifications and Goodness-of-Fit Statistics for MTMM Model 4

PARAMETER SPECIFICATIONS

LAMBDA-X
[Identical in pattern to the Lambda-X specification shown in Table 6.4: trait loadings are free parameters 1, 3, 5, ..., 31 and method loadings are free parameters 2, 4, 6, ..., 32.]

PHI
          SOCCOMP  ACADCOMP  ENGCOMP  MATHCOMP  SELFRATG  TCHRATG  PARRATG  PEERRATG
SOCCOMP      0
ACADCOMP    33         0
ENGCOMP     34        35        0
MATHCOMP    36        37       38         0
SELFRATG     0         0        0         0        0
TCHRATG      0         0        0         0        0        0
PARRATG      0         0        0         0        0        0        0
PEERRATG     0         0        0         0        0        0        0        0

GOODNESS OF FIT STATISTICS
CHI-SQUARE WITH 83 DEGREES OF FREEDOM = 110.53 (P = 0.023)
ESTIMATED NON-CENTRALITY PARAMETER (NCP) = 27.53
90 PERCENT CONFIDENCE INTERVAL FOR NCP = (4.17 ; 58.97)
ROOT MEAN SQUARE ERROR OF APPROXIMATION (RMSEA) = 0.042
90 PERCENT CONFIDENCE INTERVAL FOR RMSEA = (0.016 ; 0.061)
P-VALUE FOR TEST OF CLOSE FIT (RMSEA < 0.05) = 0.74
EXPECTED CROSS-VALIDATION INDEX (ECVI) = 1.13
90 PERCENT CONFIDENCE INTERVAL FOR ECVI = (1.01 ; 1.29)
STANDARDIZED RMR = 0.072
GOODNESS OF FIT INDEX (GFI) = 0.93
ADJUSTED GOODNESS OF FIT INDEX (AGFI) = 0.89
COMPARATIVE FIT INDEX (CFI) = 0.98
TESTING FOR CONSTRUCT VALIDITY: COMPARISON OF MODELS Now that we have examined goodness-of-fit results for each of our MTMM models, we can turn to the task of determining evidence of construct validity. In this section, we ascertain information at the matrix level only, by comparing particular pairs of models. A summary of fit related to all four MTMM models is presented in Table 6.13, and results of model comparisons are summarized in Table 6.14. Although results related to the Correlated Uniqueness model (Model 5) are also included for the sake of completeness, they are discussed subsequent to our comparison of the nested general CFA models.
Evidence of Convergent Validity As noted earlier, one criterion of construct validity bears on the issue of convergent validity, the extent to which independent measures of the same trait are correlated (e.g., teacher and self-ratings of social competence); these values should be substantial and statistically significant (Campbell & Fiske, 1959). Using Widaman's (1985) paradigm, evidence of convergent validity can be tested by comparing a model in which traits are specified (Model 1A) with one in which they are not (Model 2), the difference in χ2 between the two models (Δχ2) providing the basis for judgment; a significant difference in χ2 supports evidence of convergent validity. As shown in Table 6.14, the Δχ2 was highly significant (Δχ2(22) = 297.82, p < .001), and the difference in practical fit (ΔGFI = .16; ΔCFI = .20) substantial, thereby arguing for the tenability of this criterion.6
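The Δχ2 test just described is an ordinary chi-square difference comparison for nested models and can be reproduced by hand; a sketch in Python (scipy assumed available), using the Model 1A and Model 2 values reported in Tables 6.13 and 6.14:

```python
from scipy.stats import chi2

def chi_square_difference(chi2_restricted, df_restricted, chi2_full, df_full):
    """Chi-square difference test for nested models: the more restrictive
    model has the larger chi-square and more degrees of freedom."""
    delta = chi2_restricted - chi2_full
    ddf = df_restricted - df_full
    return delta, ddf, chi2.sf(delta, ddf)  # sf = upper-tail p-value

# Model 2 (no traits) versus Model 1A (freely correlated traits/methods)
delta, ddf, p = chi_square_difference(375.25, 99, 77.43, 77)
print(f"delta chi2({ddf}) = {delta:.2f}, p = {p:.3g}")
```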
"I:
Chapter 6
214
Table 6.13 Summary of Goodness-of-Fit Indices for MTMM Models

Model                                                          χ2      df    CFI   GFI   RMSEA  ECVI
1A  Freely correlated traits; freely correlated methods       77.43    77   1.00   .95    .01   1.02
2   No traits; freely correlated methods                     375.25    99    .80   .79    .12   2.34
3   Perfectly correlated traits; freely correlated methods   207.91    83    .91   .88    .09   1.63
4   Freely correlated traits; uncorrelated methods           110.53    83    .98   .93    .04   1.13
5   Freely correlated traits; freely correlated uniquenesses  96.47    74    .98   .94    .04   1.15
Table 6.14 Differential Goodness-of-Fit Indices for MTMM Nested Model Comparisons

                                      Difference in
Model Comparisons                   χ2      df   CFI   GFI
Test of Convergent Validity
  Model 1 vs Model 2 (traits)     297.82    22   .20   .16
Test of Discriminant Validity
  Model 1 vs Model 3 (traits)     130.48     6   .08   .07
  Model 1 vs Model 4 (methods)     33.10     6   .02   .02
Evidence of Discriminant Validity Discriminant validity is typically assessed in terms of both traits and methods. In testing for evidence of trait discriminant validity, one is interested in the extent to which independent measures of different traits are correlated; 6Although there is no firm standard for assessing the difference in indices of practical fit, such differentials have been used heuristically in order to provide more realistic indications of nested model comparisons than those based on the χ2 statistic (see, e.g., Bagozzi & Yi, 1990; Widaman, 1985); it is in this sense that they are used here.
these values should be negligible. When the independent measures represent different methods, correlations bear on the discriminant validity of traits; when they represent the same method, correlations bear on the presence of method effects, another aspect of discriminant validity. In testing for evidence of discriminant validity among traits, we compare a model in which traits correlate freely (Model 1A) with one in which they are perfectly correlated (Model 3); the larger the discrepancy between the χ2 and the GFI/CFI values, the stronger the support for evidence of discriminant validity. This comparison yielded a Δχ2 value that was statistically significant (Δχ2(6) = 130.48, p < .001), and the difference in practical fit was fairly large (ΔGFI = .07; ΔCFI = .08), thereby suggesting only modest evidence of discriminant validity at best.7 Based on the same logic, albeit in reverse, evidence of discriminant validity related to method effects was tested by comparing a model in which method factors were freely correlated (Model 1A) with one in which the method factors were specified as uncorrelated (Model 4); in this case, a significant Δχ2 (or large ΔCFI) argues for the lack of discriminant validity and thus, for common method bias across methods of measurement.8 On the strength of both statistical (Δχ2(6) = 33.10) and nonstatistical (ΔGFI = .02; ΔCFI = .02) criteria, as shown in Table 6.14, it seems reasonable to conclude that evidence of discriminant validity for the methods was substantially stronger than it was for the traits.
TESTING FOR CONSTRUCT VALIDITY: EXAMINATION OF PARAMETERS A more precise assessment of trait- and method-related variance can be ascertained by examining individual parameter estimates. Specifically, the factor loadings and factor correlations of the hypothesized model (Model 1A) provide the focus here. Because it is difficult to envision the MTMM pattern of factor loadings and correlations from the printout when more than
7Alternatively, we could compare Model 1A with one in which trait factors are specified as uncorrelated. As such, a large Δχ2 would argue against evidence of discriminant validity. 8As was noted for the traits (Footnote 6), we could alternatively specify a model in which perfectly correlated method factors are specified; as such, a minimal Δχ2 would argue against evidence of discriminant validity.
six factors are involved, these values have been tabled to facilitate the assessment of convergent and discriminant validity; factor loadings are summarized in Table 6.15 and factor correlations in Table 6.16. (For a more extensive discussion of these MTMM findings, see Byrne & Bazana, 1996.)
Evidence of Convergent Validity In examining individual parameters, convergent validity is reflected in the magnitude of the trait loadings. As indicated in Table 6.15, all but one trait loading was significant. However, in a comparison of factor loadings across traits and methods, we see that the proportion of method variance exceeds that of trait variance for all but one of the teacher ratings (Social Competence), and all of the peer ratings.9 Thus, although at first blush evidence of convergent validity appeared to be fairly good at the matrix level, more in-depth examination at the individual parameter level reveals the attenuation of traits by method effects related to teacher and peer ratings, thereby tempering evidence of convergent validity (see also Byrne & Goffin, 1993, with respect to adolescents).
Evidence of Discriminant Validity Discriminant validity bearing on particular traits and methods is determined by examining the factor correlation matrices. Although, conceptually, correlations among traits should be negligible in order to satisfy evidence of discriminant validity, such findings are highly unlikely in general and with respect to psychological data in particular. Although these findings, as shown in Table 6.16, suggest that relations between academic competence and competencies related to the specific subject areas of English and mathematics are most detrimental to the attainment of trait-discriminant validity, they are nonetheless consistent with construct validity research in this area as it relates to late preadolescent children (see Byrne & Worth Gavin, 1996). Finally, an examination of method factor correlations in Table 6.16 reflects on their discriminability, and thus on the extent to which the methods are maximally dissimilar; this factor is an important underlying assumption of the MTMM strategy (see Campbell & Fiske, 1959). Given the
9Trait and method variance, within the context of the general CFA MTMM model, equals the squared factor loading.
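Footnote 9's point can be made concrete: with standardized measures, the proportion of variance attributable to the trait or the method factor is simply the squared loading (assuming, as in the general CFA MTMM model, that trait and method factors are uncorrelated with each other). A sketch with hypothetical loadings, for illustration only:

```python
def variance_partition(trait_loading, method_loading):
    """Partition a standardized measure's variance into trait, method,
    and unique components (trait and method factors assumed uncorrelated)."""
    trait_var = trait_loading ** 2
    method_var = method_loading ** 2
    return trait_var, method_var, 1.0 - trait_var - method_var

# Hypothetical loadings, not values taken from Table 6.15.
t, m, u = variance_partition(0.58, 0.39)
print(f"trait = {t:.2f}, method = {m:.2f}, unique = {u:.2f}")
```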
Table 6.15 Trait and Method Loadings for MTMM Model 1A
[Trait loadings (SC, AC, EC, MC) and method loadings (SR, TR, PAR, PER) for the 16 measures, grouped by rating source (self, teacher, parent, and peer ratings); the column alignment of the original table could not be recovered from the scanned text. Only one trait loading, marked a in the original (.14, among the teacher ratings), was not statistically significant.]
Table 6.16 Trait and Method Correlations for MTMM Model 1A

Traits
                              SC     AC     EC     MC
Social Competence (SC)       1.00
Academic Competence (AC)      .28   1.00
English Competence (EC)       .14a   .77   1.00
Mathematics Competence (MC)   .20a   .77    .41   1.00

Methods
                              SR     TR     PAR    PER
Self Ratings (SR)            1.00
Teacher Ratings (TR)          .48   1.00
Parent Ratings (PAR)          .20    .68   1.00
Peer Ratings (PER)            .37    .42    .22a  1.00

a Not statistically significant
obvious dissimilarity of self-, teacher, parent, and peer ratings, it is rather startling to find a correlation of .68 between teacher and parent ratings of competence. One possible explanation of this finding is that, except for minor editorial changes necessary in tailoring the instrument to either teacher or parent as respondents, the substantive content of all comparable items in the teacher and parent rating scales was identically worded; the rationale here being to maximize responses by different raters of the same student.
THE CORRELATED UNIQUENESS MTMM MODEL As noted earlier, the Correlated Uniqueness model represents a special case of the general CFA model. Building on the early work of Kenny (1976, 1979), Marsh (1988, 1989) proposed this alternative MTMM model in answer to the numerous estimation and convergence problems encountered with analyses of general CFA models and, in particular, the Correlated Traits/Correlated Methods version (Model 1 in this application and typically the hypothesized model). A schematic representation of the Correlated Uniqueness model is shown in Fig. 6.2. In reviewing the model depicted in Fig. 6.2, you will note that it embodies only the four correlated trait factors; in this aspect, it is consistent with the model shown in Fig. 6.1. The notably different feature about the Correlated Uniqueness model, however, is that although no method factors are specified per se, their effects are implied from the specification of correlated error terms (the uniquenesses)10 associated with each set of observed variables embracing the same method. For example, as indicated in Fig. 6.2, all error terms associated with the Self-Rating measures of Social Competence are intercorrelated; likewise, those associated with teacher, parent, and peer ratings are also intercorrelated. Consistent with the Correlated Traits/Uncorrelated Methods model (Model 4 in this application), the Correlated Uniqueness model assumes that effects associated with one method are uncorrelated with those associated with the other methods (Marsh & Grayson, 1995). However, one critically important difference between the Correlated Uniqueness model and both the Correlated Traits/Correlated Methods (Model 1) and Correlated
10 As noted in chapter 3, the term uniqueness is used in the factor-analytic sense to mean a composite of random measurement error and specific measurement error associated with a particular measuring instrument.
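To make this specification concrete, the sketch below (plain Python, not LISREL syntax) builds the 16 × 16 pattern of free error covariances that the Correlated Uniqueness model implies: error covariances are freely estimated only among the four measures sharing a rating method, and fixed to zero across methods:

```python
import numpy as np

n_methods, n_traits = 4, 4            # self/teacher/parent/peer x SC/AC/EC/MC
n_vars = n_methods * n_traits

# 1 = free parameter, 0 = fixed at zero.  Variables are ordered in
# method blocks of four, so each method contributes one 4 x 4 block.
pattern = np.zeros((n_vars, n_vars), dtype=int)
for m in range(n_methods):
    block = slice(m * n_traits, (m + 1) * n_traits)
    pattern[block, block] = 1

# Free covariances per method block: 4*3/2 = 6, hence 4 * 6 = 24 in all.
n_free_covariances = (int(pattern.sum()) - n_vars) // 2
print(n_free_covariances)  # 24
```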
[Fig. 6.2. Schematic representation of the Correlated Uniqueness MTMM model: the four correlated trait factors (Social, Academic, English, and Mathematics Competence) load on the 16 measures, and the error terms (uniquenesses) are intercorrelated within each rating method (self, teacher, parent, peer). Only the structure of the path diagram is recoverable from the scanned original.]
FR LX(1,5) LX(2,5) LX(3,6)
FR PS(1,1) PS(2,2) PS(4,4) PS(5,5)
FR PH(1,1) PH(2,2) PH(3,3) PH(4,4) PH(5,5) PH(6,6) PH(7,7)
FR PH(2,1) PH(3,1) PH(3,2) PH(6,5) PH(7,5) PH(7,6)
FR GA(1,5) GA(1,6) GA(1,7) GA(2,5) GA(3,2) GA(3,3) GA(3,4) GA(4,2) GA(5,1)
FR BE(4,3) BE(5,1) BE(5,2) BE(5,3) BE(5,4)
VA 1.0 LY(1,1) LY(4,2) LY(9,3) LY(12,4) LY(14,5)
ST .8 LY(2,1) LY(3,1) LY(5,2) LY(6,2) LY(7,2) LY(8,2) LY(15,5)
ST .9 LY(11,3) LY(13,4) LY(16,5)
ST 1.0 LY(10,3)
VA 1.0 LX(2,1) LX(4,2) LX(6,3) LX(10,4) LX(11,5) LX(13,6) LX(15,7)
ST .5 LX(3,2)
ST .7 LX(7,4) LX(8,4) LX(9,4)
ST 1.0 LX(5,3)
ST .8 LX(1,1) LX(12,5) LX(14,6) LX(16,7)
ST -.2 LX(3,6)
ST -.4 LX(1,5)
ST -.5 LX(2,5)
ST 1.0 PH(2,2) PH(6,6)
ST .5 PH(1,1) PH(3,3) PH(4,4) PH(5,5) PH(7,7)
ST .3 PH(2,1) PH(3,1) PH(3,2)
ST .5 PH(6,5) PH(7,5) PH(7,6)
ST -.07 GA(5,1) GA(1,6)
ST -.2 GA(2,5)
ST -.3 GA(3,2)
ST -.5 GA(3,4)
ST .07 GA(1,7) GA(4,2)
ST .2 GA(1,5)
ST .5 GA(3,3)
ST -.07 BE(5,3)
ST -.1 BE(5,2) BE(5,4)
ST .3 BE(5,1) BE(4,3)
ST .1 PS(1,1) PS(2,2)
ST .3 PS(4,4) PS(5,5)
VA .001 PS(3,3)
ST .2 TD(1,1) TD(2,2) TD(7,7) TD(8,8) TD(9,9) TD(10,10) TD(13,13)
ST .2 TD(14,14) TD(15,15) TD(16,16)
ST .6 TD(3,3) TD(4,4) TD(5,5) TD(6,6) TD(11,11) TD(12,12)
ST .2 TE(1,1) TE(2,2) TE(3,3) TE(4,4) TE(5,5) TE(6,6) TE(7,7) TE(8,8) TE(9,9)
ST .2 TE(10,10) TE(11,11) TE(12,12) TE(13,13) TE(14,14) TE(15,15) TE(16,16)
OU NS
analyses are to be based on the covariance matrix. Following the indicator variable labels, we see again that the data are in the form of a raw matrix that resides external to the input file, the name of which is listed on the line following (RA=ELIND1L.DAT FO). Second, the SELECTION keyword is included because, as noted earlier, the variables need to be reordered such that the entry of dependent variables precedes that of independent variables. Thus, the observed variables measuring the dependent factors in the model (variables 17-32) are listed first, followed by those measuring the independent variables (1-16). Likewise, variable labels for the dependent factors (η1-η5; SELF-PA) are presented first, followed by those for the independent factors (ξ1-ξ7; ROLEA-PSUPPORT). Third, the MO line, of course, summarizes the decomposition of the model as described earlier such that there are 16 X-variables, 16 Y-variables, 5 η-factors, and 7 ξ-factors. Furthermore, all matrices are specified as Full and Fixed, with individual parameters related to each being specified as free in the lines that follow. Relatedly, you will note the input of three VA lines. The first two relate to a constrained value of 1.00 for the first indicator variable of each construct in the X and Y measurement models, respectively. The third VA line indicates that PS(3,3), the residual term associated with the prediction of Emotional Exhaustion (η3), has been fixed to a value of .001. This specification was determined in the original test of the hypothesized model using the EQS program (see Byrne, 1994a); as such, this parameter was constrained to its lower bound by the program.3 Finally, you will note that multiple individualized start values have been specified; these were taken from EQS estimates related to the original study. Because these values were included, the keyword NS was added to the OU line, indicating that start values provided by the program should be ignored.
LISREL Output for Hypothesized Model Selected goodness-of-fit statistics and modification indices related to the hypothesized model are presented in Tables 7.5 and 7.6, respectively. Turning first to the goodness-of-fit statistics, we see that, given the χ2 value and small probability value, confidence limits could not be computed as noted on the output. Given the known sensitivity of this statistic to sample size,

3When this parameter was left unconstrained using LISREL 8, serious problems of nonconvergence resulted.
Table 7.5 Goodness-of-Fit Statistics for Hypothesized Model of Burnout
GOODNESS OF FIT STATISTICS
CHI-SQUARE WITH 442 DEGREES OF FREEDOM = 1305.02 (P = 0.0)
ESTIMATED NON-CENTRALITY PARAMETER (NCP) = 863.02
ROOT MEAN SQUARE ERROR OF APPROXIMATION (RMSEA) = 0.057
EXPECTED CROSS-VALIDATION INDEX (ECVI) = 2.47
STANDARDIZED RMR = 0.17
GOODNESS OF FIT INDEX (GFI) = 0.88
ADJUSTED GOODNESS OF FIT INDEX (AGFI) = 0.86
COMPARATIVE FIT INDEX (CFI) = 0.91
CONFIDENCE LIMITS COULD NOT BE COMPUTED DUE TO TOO SMALL P-VALUE FOR CHI-SQUARE
however, use of the χ2 index provides little guidance in determining the extent to which the model does not fit. Thus, it is more beneficial to rely on fit as represented by the CFI, which, in this instance, provides evidence of a fairly well-fitting model (CFI = .91). Note also that the RMSEA value is well within the recommended range of acceptability (<.05 to .08). (For a review of these rule-of-thumb guidelines, you may wish to consult chapter 3, where goodness-of-fit indices are described in more detail.) Over and above the fit of the model as a whole, however, a review of the modification indices reveals some evidence of misfit in the model. Because we are only interested in the causal paths of the model at this point, only indices related to the Beta and Gamma matrices are shown in Table 7.6. Turning to this table, note that the maximum modification index is associated with Gamma (2,2), which represents a path flowing from Role Conflict to External Locus of Control. The value of 47.07 indicates that, if this parameter were to be freely estimated in a subsequent model, the overall χ2 value would drop by at least this amount. If you turn now to the Expected Parameter Change statistics related to the Gamma matrix, you will find a value of 0.12 for this ROLEC → XLOCUS parameter; this value represents the approximate value that the newly estimated parameter will assume. From a substantive perspective, it would seem perfectly reasonable that elementary teachers who experience strong feelings of role conflict may ultimately exhibit high levels of external locus of control. Given the meaningfulness of this influential flow, therefore, the model was reestimated with GA(2,2) specified as a free parameter (i.e., Model 2). Results related to this respecified model are discussed within the framework of Post Hoc Analyses.
Table 7.6
Modification Indices for Structural Parameters

[LISREL output: modification indices and expected parameter change statistics for the BETA and GAMMA matrices]

MAXIMUM MODIFICATION INDEX IS 47.07 FOR ELEMENT ( 2, 2) OF GAMMA
Chapter 7
POST HOC ANALYSES

The estimation of Model 2 yielded an overall χ²(441) value of 1237.98 and a CFI value of .92; moreover, the parameter value was slightly higher than the one predicted by the Expected Parameter Change statistic (0.16 vs. 0.12) and it was statistically significant (t = 8.57). Modification indices related to the structural parameters for Model 2 are shown in Table 7.7. In reviewing the statistics presented in Table 7.7, we see that the largest modification index for this model was BE(1,4), the omitted path from DP to SELF. These results suggest that if this path were to be specified in the model, depersonalization would have a substantial impact on self-concept. However, as emphasized in chapter 3, parameters identified by LISREL as belonging in a model are based on statistical criteria only; of more import is that their inclusion be substantively meaningful. Within the context of the original study, the incorporation of this path into the model would make no sense whatsoever because its primary purpose was to validate the impact of organizational and personality variables on burnout, and not the reverse. Thus, we ignore this suggested model modification and turn, instead, to the next largest index (36.83), which represents the path from WORK to SELF. Because the estimation of this parameter is substantively meaningful, and the sign of the Expected Change statistic (-.12) is reasonable, Model 2 was respecified to include the estimation of GA(1,3) (WORK → SELF; Model 3). Results from the estimation of Model 3 yielded a χ²(440) value of 1184.65, the CFI value was .93, and the estimated parameter value (-.17) was also statistically significant (t = -7.69). Modification indices related to this model are presented in Table 7.8. Although the program noted that the largest modification index was associated with a correlated error in the Theta Epsilon matrix, for our purposes here we focus only on the structural parameters.
Accordingly, we find two indices that have approximately the same values: 30.70, representing the path from WORK to DP, and 30.68, representing the path from CLIMATE to DP. In reviewing the larger of the two (WORK → DP), we see that the Expected Change statistic has the wrong sign. That is to say, from a substantively meaningful perspective, we would expect that high levels of work overload would generate high levels of depersonalization, thereby yielding a positive Expected Change statistic value; the reverse, however, was found to be the case (-5.32). (Recall Bentler's (1995) caveat, noted in chapter
Table 7.7
Modification Indices for Structural Parameters of Model 2

[LISREL output: modification indices and expected parameter change statistics for the BETA and GAMMA matrices]

MAXIMUM MODIFICATION INDEX IS 45.61 FOR ELEMENT ( 1, 4) OF BETA
Table 7.8
Modification Indices for Structural Parameters of Model 3

[LISREL output: modification indices and expected parameter change statistics for the BETA and GAMMA matrices]

MAXIMUM MODIFICATION INDEX IS 38.24 FOR ELEMENT (10, 9) OF THETA-EPS
Application 5
3, that these values can be affected by both the scaling and identification of factors and variables.) Thus, the incorporation of this suggested path was ignored in favor of a model that included the estimation of GA(4,4) (CLIMATE → DP) as a free parameter (i.e., Model 4). Again, the χ² difference between Models 3 and 4 was statistically significant (Δχ²(1) = 36.30), as was the estimated parameter (-0.55, t = -6.36); there was no change in the CFI value (.93). Modification indices related to the estimation of Model 4 are shown in Table 7.9. As you will note, there are no outstanding values suggestive of model misfit. Thus, no further consideration was given to the inclusion of additional parameters.

Thus far, discussion related to model fit has considered only the addition of parameters to the model. However, another side to the question of fit, particularly as it pertains to a full model, is the extent to which certain initially hypothesized paths may be irrelevant to the model. One way of determining such irrelevancy is to examine the statistical significance of all structural parameter estimates. This information, as derived from the estimation of Model 4, is presented in Table 7.10. In reviewing the structural parameter estimates for Model 4, we can see five parameters that are nonsignificant; these are: BE(5,2) (XLOCUS → PA); GA(1,5) (DECISION → SELF); GA(1,6) (SSUPPORT → SELF); GA(4,2) (ROLEC → DP); and GA(5,1) (ROLEA → PA). In the interest of parsimony, then, a final model of burnout was estimated with these five structural paths deleted from the model. Estimation of this final model resulted in an overall χ²(444) value of 1154.41, with a CFI value of .93. At this point you may have some concern over the slight erosion in model fit from χ²(439) = 1148.35 for Model 4 to χ²(444) = 1154.41 for Model 5, the final model. However, with the deletion of any parameters from a model, such a change is to be expected.
The important aspect of this change in model fit is that the χ² difference between the two models is not statistically significant (Δχ²(5) = 6.06). A schematic representation of this final structural model of burnout for elementary teachers, which includes the standardized path coefficients, is displayed in Fig. 7.3. Note, however, that the high standard errors associated with the three paths predicting Emotional Exhaustion, in addition to the inappropriate sign associated with the path flowing from Role Conflict to Emotional Exhaustion, are clearly a function of the constrained value of .001 placed on the related residual error term. Interestingly, similar findings were not present in the original study. Estimation of the final model with this parameter freely estimated nevertheless resulted in a nonconvergent solution after 255 iterations.
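The χ² difference tests quoted above can be verified with an ordinary chi-square survival function. The closed form below (valid for odd degrees of freedom only) needs nothing beyond `math.erfc`; the χ² and df values are those reported in the text.

```python
import math

def chi2_sf(x, df):
    """P(X > x) for a chi-square variate with odd degrees of freedom."""
    if df % 2 == 0 or df < 1:
        raise ValueError("closed form implemented for odd df only")
    tail = math.erfc(math.sqrt(x / 2.0))
    if df == 1:
        return tail
    # running sum of x**j / (2j+1)!! for j = 0 .. (df-3)/2
    s, term = 0.0, 1.0
    for j in range((df - 1) // 2):
        s += term
        term *= x / (2 * j + 3)
    return tail + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0) * s

p_added = chi2_sf(36.30, 1)   # Model 3 -> Model 4: one path added
p_deleted = chi2_sf(6.06, 5)  # Model 4 -> Model 5: five paths deleted
print(p_added < .001, p_deleted > .05)  # -> True True
```

The first comparison confirms a significant improvement in fit when GA(4,4) is freed; the second confirms that deleting the five nonsignificant paths produces no significant erosion in fit.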
Table 7.9
Modification Indices for Structural Parameters of Model 4

[LISREL output: modification indices and expected parameter change statistics for the BETA and GAMMA matrices]
Table 7.10
Structural Parameter Estimates for Model 4

[LISREL output: unstandardized BETA and GAMMA estimates, each with its standard error in parentheses and t-value]
FIG. 7.3. Final Model of Causal Structure Related to Burnout for Elementary Teachers. (Coefficients associated with structural paths represent unstandardized estimates; parenthesized values represent standard errors.)
In addition, to enable you to perhaps gain a clearer picture of the models tested in this application, model specifications and goodness-of-fit statistics are summarized in Table 7.11. In working with structural equation models, it is very important to know when to stop fitting a model. Although there are no firm rules or regulations to guide this decision, the researcher's best yardsticks include (a) a thorough knowledge of the substantive theory, (b) an adequate assessment of statistical criteria based on information pooled from various indices of fit, and (c) a watchful eye on parsimony. In this regard, the SEM researcher must walk a fine line between incorporating a sufficient number of parameters to yield a model that adequately represents the data, and falling prey to the temptation of incorporating too many parameters in a zealous attempt to attain the best-fitting model, statistically. Two major problems with the latter tack are that (a) the model can comprise parameters that actually contribute only trivially to its structure, and (b) the more parameters there are in a model, the more difficult it is to replicate its structure should future validation research be conducted. In the case of the model tested in this chapter, I considered the addition of three structural paths to be justified both substantively and statistically. From the statistical perspective, we can see in Table 7.11 that the addition of each new parameter resulted in a statistically significant difference in fit from the previously specified model. The inclusion of these three additional
Table 7.11
Summary of Specifications and Fit Statistics for Tested Models of Burnout

Model      Parameters Added            Parameters Deleted            χ²      df   CFI   ECVI  RMSEA    Δχ²  Δdf
1 Initial                                                         1305.02   442   .91   2.47   .06
2          GA(2,2) ROLEC -> XLOCUS                                1237.98   441   .92   2.36   .05    67.04   1
3          GA(1,3) WORK -> SELF                                   1184.65   440   .93   2.28   .05    53.33   1
4          GA(4,4) CLIMATE -> DP                                  1148.35   439   .93   2.22   .05    36.30   1
5 Final                                BE(5,2) XLOCUS -> PA       1154.41   444   .93   2.21   .05     6.06   5
                                       GA(1,5) DECISION -> SELF
                                       GA(1,6) SSUPPORT -> SELF
                                       GA(4,2) ROLEC -> DP
                                       GA(5,1) ROLEA -> PA
paths, and the deletion of five originally specified paths, resulted in a final model that fitted the data well (CFI = .93; RMSEA = .05). Furthermore, based on the ECVI index, it appears that the final model has the greatest potential for replication in other samples of elementary teachers, compared with Models 1 through 4. In concluding this chapter, let us now summarize and review findings from the various models tested. Of 14 causal paths specified in the hypothesized model (see Fig. 7.2), 9 were found to be statistically significant for elementary teachers. These paths reflected the impact of (a) Classroom Climate, Work Overload, and Role Conflict on Emotional Exhaustion; (b) Decision Making on External Locus of Control; (c) Self-Esteem, Emotional Exhaustion, and Depersonalization on Perceived Personal Accomplishment; (d) Emotional Exhaustion on Depersonalization; and (e) Peer Support on Self-Esteem. Three paths not specified a priori (role conflict → external locus of control; work overload → self-esteem; classroom climate → depersonalization) proved to be essential components of the causal structure; they were therefore added to the model. Finally, five hypothesized paths (external locus of control → personal accomplishment; decision making → self-esteem; superior support → self-esteem; role conflict → depersonalization; role ambiguity → personal accomplishment) were not significant and were subsequently deleted from the model. In comparing results from the analyses conducted here using LISREL 8 with those of the original study (Byrne, 1994a) using EQS, I find it distressing, albeit not surprising, that some discrepancy exists between the two applications with respect to the post hoc analyses. Indeed, it seems evident that these discrepancies arose as a consequence of the alternate approaches taken by the two programs to the identification of misspecified and nonsignificant model parameters.
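The ECVI values in Table 7.11 can be reproduced under the usual definition ECVI = (χ² + 2q)/(N - 1), where q is the number of freely estimated parameters and N = 599 is the sample size given in the SIMPLIS file. The q values below are assumptions chosen for consistency with the tabled ECVIs (they are not reported in the text); a minimal check:

```python
# Hedged check of the ECVI column of Table 7.11.
def ecvi(chi2, q, n=599):
    """ECVI = (chi-square + 2q) / (N - 1); q = number of free parameters."""
    return (chi2 + 2 * q) / (n - 1)

fits = [  # (model, chi2, assumed q, ECVI as reported in Table 7.11)
    (1, 1305.02, 86, 2.47),
    (2, 1237.98, 87, 2.36),
    (3, 1184.65, 88, 2.28),
    (4, 1148.35, 89, 2.22),
    (5, 1154.41, 84, 2.21),
]
for model, chi2, q, reported in fits:
    print(model, round(ecvi(chi2, q), 2), reported)
```

Note that the final model's smaller ECVI reflects both its χ² and its reduced parameter count, which is why it is favored for replication.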
Whereas the modification indices employed by LISREL are derived univariately, those employed by EQS are based on the Lagrange Multiplier Test (LMTEST; Aitchison & Silvey, 1958; Lee & Bentler, 1980), which considers fixed parameters in the model multivariately. Similarly, whereas the identification of nonsignificant model parameters is determined within the LISREL framework using a univariate approach, EQS uses a multivariate approach to test for nonsignificant parameters in the model via the Wald Test (Wald, 1943). Although the modification indices from both programs were consistent in suggesting the addition of three, and the deletion of five, structural paths, the identified parameters differed somewhat from one program to the other.
More specifically, whereas LISREL 8 targeted the inclusion of paths flowing from Role Conflict to External Locus of Control (ROLEC → XLOCUS) and from Work Overload to Self-Esteem (WORK → SELF), the EQS program pinpointed those from Work Overload to External Locus of Control (WORK → XLOCUS) and from Self-Esteem to External Locus of Control (SELF → XLOCUS); the path from Classroom Climate to Depersonalization (CLIMATE → DP) remained the same across programs. Likewise, there was some discrepancy with respect to parameters that were candidates for deletion (i.e., parameters that were statistically nonsignificant). Of the five structural paths considered to be redundant in the present application, one differed from the original study. Whereas LISREL 8 yielded a nonsignificant estimate for the structural path flowing from Decision Making to Self-Esteem (DECISION → SELF), this path was not identified by EQS; rather, the Wald Test suggested the deletion of the path from Work Overload to Emotional Exhaustion (WORK → EE). The aforementioned discrepancies notwithstanding, in general we can conclude from this application, as well as from the original study, that role conflict, work overload, classroom climate, participation in the decision-making process, and the support of one's colleagues are potent organizational determinants of burnout for elementary teachers. The process appears to be tempered, however, by one's general sense of self-worth and locus of control.
FROM THE SIMPLIS PERSPECTIVE

For an example of a SIMPLIS file in this chapter, let us examine one related to the initial model, as shown in Table 7.12. Although most of the setup here will be familiar to you at this point, there are a few specific points worthy of note. First, note that, consistent with the LISREL input file (Table 7.4), the data are to be found in the same data file. However, in contrast to the LISREL file, which included a fixed-format statement, the SIMPLIS file requires that this specification constitute the first line of the data set. Second, appearing first under the EQUATIONS keyword are specifications bearing on the measurement model, followed by those for the structural model. For ease of interpretation, however, I have divided these specifications into three separate sections. The first group of specifications relates to the originally hypothesized CFA model linking the
Table 7.12
SIMPLIS Input: Initial Model of Burnout for Elementary Teachers

! Testing for Invariance of Causal Structure of Burnout for Elementary Teachers
! "BURNEL1.SIM"  Initially Hypothesized Model
OBSERVED VARIABLES: ROLEA1 ROLEA2 ROLEC1 ROLEC2 WORK1 WORK2 CLIMATE1 CLIMATE2
 CLIMATE3 CLIMATE4 DEC1 DEC2 SSUP1 SSUP2 PSUP1 PSUP2 SELF1 SELF2 SELF3
 XLOC1 XLOC2 XLOC3 XLOC4 XLOC5 EE1 EE2 EE3 DP1 DP2 PA1 PA2 PA3
RAW DATA from FILE ELlllllL.DAT
SAMPLE SIZE: 599
LATENT VARIABLES: SELF XLOCUS EE DP PA ROLEA ROLEC WORK CLIMATE DECISION
 SSUPPORT PSUPPORT
EQUATIONS:
SELF1 = 1*SELF
SELF2 = (.8)*SELF
SELF3 = (.8)*SELF
XLOC1 = 1*XLOCUS
XLOC2 = (.8)*XLOCUS
XLOC3 = (.8)*XLOCUS
XLOC4 = (.8)*XLOCUS
XLOC5 = (.8)*XLOCUS
EE1 = 1*EE
EE2 = (1)*EE
EE3 = (.9)*EE
DP1 = 1*DP
DP2 = (.9)*DP
PA1 = 1*PA
PA2 = (.8)*PA
PA3 = (.9)*PA
ROLEA1 = 1*ROLEA
ROLEA2 = (1)*ROLEA
ROLEC1 = 1*ROLEC
ROLEC2 = (1.7)*ROLEC
WORK1 = 1*WORK
WORK2 = (.9)*WORK
CLIMATE1 = 1*CLIMATE
CLIMATE2 = (.6)*CLIMATE
CLIMATE3 = (.6)*CLIMATE
CLIMATE4 = (.6)*CLIMATE
DEC1 = 1*DECISION
DEC2 = (1)*DECISION
SSUP1 = 1*SSUPPORT
SSUP2 = (1)*SSUPPORT
PSUP1 = 1*PSUPPORT
PSUP2 = (1)*PSUPPORT
ROLEA1 = (-.4)*DECISION
ROLEA2 = (-.5)*DECISION
ROLEC1 = (-.2)*SSUPPORT
SELF = DECISION SSUPPORT PSUPPORT
XLOCUS = DECISION
EE = ROLEC WORK CLIMATE
DP = EE ROLEC
PA = EE DP ROLEA SELF XLOCUS
SET the error variance of EE to .001
SET the covariance of ROLEA and CLIMATE to 0.0
SET the covariance of ROLEA and DECISION to 0.0
SET the covariance of ROLEA and SSUPPORT to 0.0
SET the covariance of ROLEA and PSUPPORT to 0.0
SET the covariance of ROLEC and CLIMATE to 0.0
SET the covariance of ROLEC and DECISION to 0.0
SET the covariance of ROLEC and SSUPPORT to 0.0
SET the covariance of ROLEC and PSUPPORT to 0.0
SET the covariance of WORK and CLIMATE to 0.0
SET the covariance of WORK and DECISION to 0.0
SET the covariance of WORK and SSUPPORT to 0.0
SET the covariance of WORK and PSUPPORT to 0.0
SET the covariance of CLIMATE and DECISION to 0.0
SET the covariance of CLIMATE and SSUPPORT to 0.0
SET the covariance of CLIMATE and PSUPPORT to 0.0
LISREL OUTPUT
END OF PROBLEM
Application 5
indicator variables to their respective factors. The second group of three equations relates to the cross-loadings as determined from tests of the original measurement model; the third set of specifications pertains to the structural paths. Third, consistent with the LISREL file, the residual error term associated with Emotional Exhaustion has been constrained to .001. Finally, because the hypothesized model specified only six factor correlations among the independent latent factors a priori, the remaining 15 correlations in the factor variance/covariance matrix must be specified as being equal to zero. This requirement, of course, derives from the fact that SIMPLIS presumes all independent factors to be intercorrelated.
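The bookkeeping behind those SET statements can be verified directly: with seven independent factors there are 21 possible pairwise covariances, of which 6 were hypothesized a priori, leaving 15 to be fixed at zero. A minimal sketch:

```python
from itertools import combinations

# The seven independent (exogenous) factors listed in the SIMPLIS file.
exogenous = ["ROLEA", "ROLEC", "WORK", "CLIMATE",
             "DECISION", "SSUPPORT", "PSUPPORT"]

all_pairs = list(combinations(exogenous, 2))  # every possible factor covariance
print(len(all_pairs))       # 21 possible covariances, C(7,2)
print(len(all_pairs) - 6)   # 15 must be SET ... to 0.0
```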
APPLICATIONS RELATED TO OTHER DISCIPLINES

Business: Alford, B. L., & Sherrell, D. L. (1996). The role of affect in consumer satisfaction judgments of credence-based services. Journal of Business Research, 37, 71-84.

Education: Hagedorn, L. S. (1996). Wage equity and female faculty job satisfaction: The role of wage differentials in a job satisfaction causal model. Research in Higher Education, 37, 569-598.

Gerontology: Gignac, M. A. M., Kelloway, E. K., & Gottlieb, B. H. (1996). The impact of caregiving on employment: A mediational model of work-family conflict. Canadian Journal on Aging, 15, 525-542.

Leisure Studies: Lindberg, K., & Johnson, R. (1997). Modeling resident attitudes toward tourism. Annals of Tourism Research, 24, 402-424.

Marketing: Thompson, K. N., & Getty, J. M. (1994). Structural model of relations among quality, satisfaction, and recommending behavior in lodging decisions. Structural Equation Modeling: A Multidisciplinary Journal, 1, 146-160.
PART THREE
Multiple-Group Analyses
Confirmatory Factor Analytic Models

CHAPTER 8: APPLICATION 6
Testing for Invariant Factorial Structure of a Theoretical Construct (First-Order CFA Model)

CHAPTER 9: APPLICATION 7
Testing for Invariant Factorial Structure of Scores From a Measuring Instrument (First-Order CFA Model)

CHAPTER 10: APPLICATION 8
Testing for Invariant Latent Mean Structures

The Full Latent Variable Model

CHAPTER 11: APPLICATION 9
Testing for Invariant Pattern of Causal Structure
CHAPTER 8
Application 6: Testing for Invariant Factorial Structure of a Theoretical Construct (First-Order CFA Model)

Up to this point, all applications have illustrated analyses based on single samples. In this section, however, we focus on applications involving more than one sample where the central concern is whether or not components of the measurement model, the structural model, or both are invariant (i.e., equivalent) across particular groups. In seeking evidence of multigroup invariance, researchers are typically interested in finding the answer to one of five questions. First, do the items comprising a particular measuring instrument operate equivalently across different populations (e.g., gender, age, ability)? In other words, is the measurement model group-invariant? Second, is the factorial structure of a single instrument, or of a theoretical construct measured by multiple instruments, equivalent across populations? Here, invariance of the structural model is of primary interest, in particular the equivalency of relations among the theoretical constructs. Third, are certain paths in a specified causal structure invariant across populations? Fourth, are the latent means of particular constructs in a model different across populations? Finally, does the factorial structure of a measuring instrument replicate across independent samples of the same population? This latter question, of course, addresses the issue of cross-validation. Applications presented in the next four chapters provide you with specific examples of how each of these questions can be answered using structural equation modeling based on LISREL 8.
Chapter 8
In our first multigroup application, we test hypotheses related to the equivalencies of self-concept measurement and structure across male and female high school-age adolescents. In particular, we wish to determine if multidimensional facets of self-concept (general, academic, English, mathematics), and the multiple assessment instruments used to measure them, are equivalent across gender for this population. This application is taken from a study by Byrne and Shavelson (1987), a follow-up to their earlier work that validated the structure of self-concept for high school-age adolescents in general (Byrne & Shavelson, 1986).
TESTING FOR MULTIGROUP INVARIANCE

The General Framework

Development of a procedure capable of testing for invariance simultaneously across groups derives from the seminal work of Joreskog (1971a). Accordingly, Joreskog recommended that all tests of invariance begin with a global test of the equality of covariance structures across groups. In other words, one tests the null hypothesis (H0) that Σ1 = Σ2 = ... = ΣG, where Σ is the population variance-covariance matrix and G is the number of groups. Rejection of the null hypothesis therefore argues for the nonequivalence of the groups and, thus, for the subsequent testing of increasingly restrictive hypotheses in order to identify the source of noninvariance. On the other hand, if H0 cannot be rejected, the groups are considered to be equivalent; tests for invariance are then unjustified, group data should be pooled, and all subsequent investigative work based on single-group analyses. Although this omnibus test appears reasonable and is fairly straightforward, it often leads to contradictory findings with respect to equivalencies across groups. For example, sometimes the null hypothesis is found tenable, yet subsequent tests of hypotheses related to the invariance of particular measurement or structural parameters must be rejected (see, e.g., Joreskog, 1971a). Alternatively, the global null hypothesis may be rejected, yet tests for the invariance of measurement and structural parameters hold (see, e.g., Byrne, 1988c).¹ Such inconsistencies in the global test for invariance stem
¹Although trivial inequalities related to particular errors of measurement were identified.
from the fact that there is no baseline model for the test of invariant variance-covariance matrices, thereby making it substantially more stringent than is the case for tests of invariance related to sets of parameters in the model (B. O. Muthen, personal communication, October, 1988). Indeed, Muthen contended that the omnibus test provides little guidance in testing for equality across groups and, thus, should not be regarded as a necessary prerequisite to the testing of more specific hypotheses related to group invariance. In testing for equivalencies across groups, sets of parameters are put to the test in a logically ordered and increasingly restrictive fashion. Depending on the model and hypotheses to be tested, the following sets of parameters are most commonly of interest in answering questions related to group invariance: (a) factor loading paths (λs), (b) factor variances/covariances (φs), and (c) structural regression paths (γs; βs); of less interest are tests for the invariance of error variances/covariances (δs; εs) and disturbance terms (ζs). In general, except in particular instances when, for example, it might be of interest to test for the invariant reliability of an assessment measure across groups (see, e.g., Byrne, 1988c), the equality of error variances and covariances is probably the least important hypothesis to test (Bentler, 1995). Although the Joreskog tradition of invariance testing holds that the equality of these parameters should be tested, it is now widely accepted that to do so represents an overly restrictive test of the data.
The General Procedure

In the Joreskog tradition, tests of hypotheses related to group invariance typically begin with scrutiny of the measurement model. In particular, the pattern of factor loadings for each observed measure is tested for its equivalence across the groups. Once it is known which measures are group-invariant, these parameters are constrained equal while subsequent tests of the structural parameters are conducted. (It is possible that not all measures are equal across groups, as we see later.) As each new set of parameters is tested, those known to be group-invariant are constrained equal. Thus, the process of determining nonequivalence of measurement and structural parameters across groups involves the testing of a series of increasingly restrictive hypotheses. Given the univariate approach to testing these hypotheses using the LISREL program, the orderly sequence of analytic steps is both necessary and strongly recommended.
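The sequence of increasingly restrictive tests can be pictured as a chain of nested models, each compared with its predecessor by a χ² difference test. The sketch below is purely schematic: the fit values are hypothetical placeholders, not results from this book.

```python
# Schematic sketch of an ordered invariance-testing sequence. Each model keeps
# the constraints already found tenable and adds a new set; a nested pair is
# compared via the chi-square difference against the .05 critical value.
CRIT_05 = {1: 3.84, 5: 11.07, 9: 16.92, 10: 18.31}  # chi-square .05 critical values

models = [  # (constraints imposed, chi2, df) -- hypothetical two-group fits
    ("configural (no equality constraints)", 138.26, 65),
    ("+ invariant factor loadings", 149.10, 74),
    ("+ invariant factor variances/covariances", 162.80, 84),
]
results = []
for (_, c0, d0), (label, c1, d1) in zip(models, models[1:]):
    d_chi2, d_df = round(c1 - c0, 2), d1 - d0
    tenable = d_chi2 <= CRIT_05[d_df]  # fail to reject -> constraints tenable
    results.append((label, d_chi2, d_df, tenable))
    print(label, d_chi2, d_df, tenable)
```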
Preliminary Single-Group Analyses

As a prerequisite to testing for factorial invariance, it is customary to consider a baseline model that is estimated separately for each group. This model represents the best-fitting one to the data both from the perspective of parsimony and from the perspective of substantive meaningfulness. Because the χ² statistic and its degrees of freedom are additive, the sum of the χ²s reflects the extent to which the underlying structure fits the data across groups. However, because measuring instruments are often group-specific in the way they operate, baseline models are not expected to be identical across groups. For example, whereas the baseline model for one group might include cross-loadings,² error covariances, or both, this may not be so for other groups under study. A priori knowledge of such group differences, as is illustrated later, is critical to the application of invariance-testing procedures. Although the bulk of the literature suggests that the number of factors must be equivalent across groups before further tests of invariance can be conducted, this strategy represents a logical starting point only, and is not a necessary condition; indeed, only the similarly specified parameters within the same factor need be equated (Werts, Rock, Linn, & Joreskog, 1976). Because the estimation of baseline models involves no between-group constraints, the data can be analyzed separately for each group. However, in testing for invariance, equality constraints are imposed on particular parameters and, thus, the data for all groups must be analyzed simultaneously to obtain efficient estimates (Bentler, 1995; Joreskog & Sorbom, 1996a); the pattern of fixed and free parameters nonetheless remains consistent with the baseline model specification for each group.
In summary, tests for invariance can involve both measurement and structural components of a model, the particular combination varying in accordance with the model under study. For example, in the case of first-order CFA models, the pattern of factor loadings (the Λ matrix) and structural relations among the factors (the Φ matrix) are typically of most interest; with second-order models, the focus is on both the lower and higher order factor loadings (the Λ and Γ matrices); and with full SEM models, the focus is on structural paths among the factors (the Γ and Β matrices).

²The loading of a variable on more than one factor.
TESTING FOR INVARIANCE ACROSS GENDER

The Hypothesized Model

The model on which the application illustrated in the present chapter was originally based (Byrne & Shavelson, 1987) derived from an earlier construct validity study of adolescent self-concept structure (Byrne & Shavelson, 1986). In particular, the model argued for a four-factor structure comprising general self-concept, academic self-concept, English self-concept, and mathematics self-concept, with the four constructs assumed to be intercorrelated. A schematic representation of this model is shown in Fig. 8.1. Except for academic self-concept, each of the other dimensions is measured by three independent measures. In the case of academic self-concept, findings from a preliminary factor analysis of the Affective Perception Inventory (Soares & Soares, 1979) revealed several inadequacies in its measurement of the construct and, thus, it was deleted from all subsequent analyses. It was originally hypothesized that each subscale measure would have a nonzero loading on the self-concept factor it was designed to measure, and a zero loading on all other factors; error/uniquenesses associated with each of the measures were assumed to be uncorrelated. Relations among the four self-concept facets represent the structural model to be tested for its equivalency across groups. At issue in the Byrne and Shavelson (1987) study was whether the structure of self-concept, and the instruments used in measuring components of this structure as depicted in Fig. 8.1, were equivalent across adolescent males and females. Of primary import, therefore, was the invariance of both the measurement and structural models.
The Baseline Models

As noted earlier, prior to testing for invariance across multigroup samples, it is customary to first establish baseline models separately for each group under study. The baseline self-concept models for adolescent males and females, as reported by Byrne and Shavelson (1987), are shown in Fig. 8.2. In fitting the baseline model for each sex, Byrne and Shavelson (1987) reported a substantial drop in χ² when the English self-concept subscale
FIG. 8.1. Hypothesized Four-factor Model of Adolescent Self-concept. From A Primer of LISREL: Basic Applications and Programming for Confirmatory Factor Analytic Models (p. 128), by B. M. Byrne, 1989, New York: Springer-Verlag. Reprinted with permission.
FIG. 8.2. Baseline Models of Self-concept for Adolescent Males and Females. Adapted from Byrne, B. M., & Shavelson, R. J. (1987). Adolescent self-concept: Testing the assumption of equivalent structure across gender. American Educational Research Journal, 24(4), 365-385. (Figure 1, p. 374). Copyright 1987 by the American Educational Research Association. Adapted by permission of the publisher.
(SDQESC) of the Self Description Questionnaire III (SDQIII; Marsh, 1992b) was free to cross-load onto the general SC factor (λ61). Moreover, for males only, the mathematics SC subscale of the SDQIII (SDQMSC) was allowed to cross-load onto the English SC factor (λ93). Finally, based on their substantial size and substantive reasonableness, five error covariances were included in the baseline model for males, and three in the baseline model for females. With the aforementioned specifications in place, these models were considered optimal in representing the data for adolescent males and females. Overall fit of the male model was χ²(31) = 60.96, with a GFI of .97 and a CFI of .99; for the female model, it was χ²(34) = 77.30, with a GFI of .97 and a CFI of .99.
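The degrees of freedom for each baseline model can be recovered by hand: with 11 indicators there are 11(12)/2 = 66 distinct variances and covariances per group, from which the freely estimated parameters are subtracted. A minimal sketch (the helper function and the parameter tallies are mine, counted from the model specifications described above):

```python
# Degrees of freedom for a single-group CFA: distinct data moments
# minus freely estimated parameters (a sketch based on the baseline
# model specifications described in the text).

def dof(n_indicators, n_loadings, n_phi, n_td_diag, n_td_cov):
    moments = n_indicators * (n_indicators + 1) // 2
    return moments - (n_loadings + n_phi + n_td_diag + n_td_cov)

# Males: 7 common loadings + 2 cross-loadings, 10 Phi elements,
# 11 error variances, 5 error covariances.
df_males = dof(11, 9, 10, 11, 5)
# Females: 7 common loadings + 1 cross-loading, 3 error covariances.
df_females = dof(11, 8, 10, 11, 3)
print(df_males, df_females)  # 31 34
```

These counts match the χ²(31) and χ²(34) values reported for the male and female baseline models.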
Hypothesis-Testing Strategy

In testing for the invariance of self-concept across gender, we consider three hypotheses: (a) that the number of underlying factors is equivalent, (b) that the pattern of factor loadings is equivalent, and (c) that structural relations among the four facets of the construct are equivalent.³ The data used in testing these hypotheses are in the form of a correlation matrix and, for purposes of the present application, we embed them into the LISREL input file. The standard deviations are also included, thereby enabling the analyses to be based on the covariance matrices. We turn now to the complete LISREL input file related to the first hypothesis, which is listed in Table 8.1.
Hypothesis I: Testing for the Validity of a Four-Factor Structure

LISREL Input File

There are several unique features of the multigroup file that can be illustrated using this initial input file; only pertinent portions of subsequent input files are shown. First, as noted earlier, when analyses focus on multigroup comparisons with constraints between the groups (i.e., particular parameters are constrained equal across groups), it is imperative that parameters for all groups be estimated simultaneously. This is because in multiple-group models the fitting function represents a weighted combination of model fit across groups (see Bollen, 1989b). For this to be possible, model specifications for each group must reside in the same file. This is accomplished by stacking the input file for each group, one after the other. For example, in the present case, model specifications for males appear first (n = 412), followed by those for females (n = 420); each terminates with the OU line. Second, on the DA card, you will note the specification NG=2, indicating that the number of groups equals two. Because this command alerts the program to the number of input data sets to be included in the simultaneous analysis, it is essential that it be specified on the DA line for the first group; DA lines for all remaining groups need only include the sample size. Finally, given that start values have been included in the file, the keyword NS has been added to the OU line for each group.
Testing that self-concept structure is best described by a four-factor structure for both adolescent girls and boys is a logical starting point in tests for invariance using the LISREL approach. However, this particular test differs from the rest in that no equality constraints are specified across groups. Rather, the tenability of the hypothesized structure rests on the

Table 8.1
LISREL Input for Simultaneous Estimation of Male and Female Baseline Models

!TESTING FOR INVARIANCE OF SELF-CONCEPT ACROSS GENDER "SCINVSIM"
!GROUP 1 = MALES !Simultaneous Run
DA NG=2 NI=15 NO=412 MA=CM
LA
SDQGSC SDQASC SDQESC SDQMSC APIGSC SESGSC APIASC SCAASC APIESC SCAESC
APIMSC SCAMSC GPA ENG MATH
KM SY
(15F4.3)
1000
 345 1000
 287  377 1000
 252  579  114 1000
 637  335  253  255 1000
 768  339  302  278  600 1000
 543  640  382  451  656  566 1000
 254  703  337  591  312  313  587 1000
 166  429  724  143  264  216  445  364 1000
 104  411  506  142  199  163  373  551  630 1000
 247  601  202  880  288  309  530  621  290  199 1000
 208  526  127  827  253  231  473  676  155  234  808 1000
 035  517  117  481  067  080  374  687  121  340  434  524 1000
-020  410  158  272  003  062  297  507  215  492  261  294  786 1000
 040  378  060  609  076  074  295  525  077  172  560  649  746  533 1000
SD
13.856 13.383 10.017 15.977 9.442 4.889 9.999 5.943 11.099 5.929 11.762 7.873 11.232 12.664 16.898
SELECTION
1 5 6 2 8 3 9 10 4 11 12/
MO NX=11 NK=4 LX=FU,FI PH=SY,FR TD=SY
LK
GENSC ACADSC ENGSC MATHSC
FR LX(2,1) LX(3,1) LX(5,2) LX(7,3) LX(8,3) LX(10,4) LX(11,4)
FR LX(6,1) LX(9,3)
FR TD(8,5) TD(10,7) TD(11,5) TD(11,8) TD(7,2)
VA 30.0 LX(1,1) LX(4,2) LX(6,3) LX(9,4)
ST 15.0 LX(2,1) LX(3,1) LX(5,2) LX(7,3) LX(8,3) LX(10,4) LX(11,4)
ST 5.0 LX(6,1)
ST -5.0 LX(9,3)
ST .1 PH(1,1) PH(2,2) PH(3,3) PH(4,4)
ST .05 PH(2,1) PH(3,1) PH(4,1) PH(3,2) PH(4,2) PH(4,3)
ST 40.0 TD(1,1) TD(2,2) TD(4,4) TD(6,6)
ST 15.0 TD(3,3) TD(5,5) TD(7,7) TD(8,8) TD(9,9) TD(10,10) TD(11,11)
ST 6.0 TD(8,5) TD(10,7) TD(11,5) TD(11,8) TD(7,2)
OU NS
GROUP 2 = FEMALES
DA NO=420
LA
SDQGSC SDQASC SDQESC SDQMSC APIGSC SESGSC APIASC SCAASC APIESC SCAESC
APIMSC SCAMSC GPA ENG MATH
KM SY
(15F4.3)
1000
 293 1000
 278  377 1000
 109  379 -068 1000
 656  300  211  129 1000
 825  331  311  160  677 1000
 519  569  329  321  595  566 1000
 191  659  336  378  137  263  495 1000
 146  400  677 -042  185  211  397  409 1000
 168  474  554  026  140  233  353  626  657 1000
 123  406 -026  857  186  185  388  408  071  049 1000
 079  362 -039  822  082  124  320  509 -026  080  806 1000
 014  487  165  394  027  096  345  675  115  344  316  467 1000
 022  451  252  257  058  083  305  557  269  554  175  250  779 1000
-056  272 -006  550 -033 -001  216  439 -059  073  475  662  732  464 1000
SD
14.524 11.014 9.479 16.347 8.853 5.019 8.656 4.873 10.701 5.319 11.073 7.087 9.828 11.655 14.643
SELECTION
1 5 6 2 8 3 9 10 4 11 12/
MO LX=FU,FI PH=SY,FR TD=SY
LK
GENSC ACADSC ENGSC MATHSC
FR LX(2,1) LX(3,1) LX(5,2) LX(7,3) LX(8,3) LX(10,4) LX(11,4)
FR LX(6,1)
FR TD(8,5) TD(10,7) TD(11,5)
VA 30.0 LX(1,1) LX(4,2) LX(6,3) LX(9,4)
ST 15.0 LX(2,1) LX(3,1) LX(5,2) LX(7,3) LX(8,3) LX(10,4) LX(11,4)
ST 5.0 LX(6,1)
ST 1.0 PH(1,1) PH(2,2) PH(3,3) PH(4,4)
ST .05 PH(2,1) PH(3,1) PH(4,1) PH(3,2) PH(4,2)
ST -.01 PH(4,3)
ST 40.0 TD(1,1) TD(2,2) TD(4,4) TD(6,6) TD(7,7) TD(9,9)
ST 10.0 TD(5,5) TD(8,8) TD(11,11) TD(10,7)
ST 18.0 TD(10,10)
ST 5.0 TD(8,5) TD(11,5) TD(3,3)
OU NS

³Given that the testing of equality constraints related to error variances and covariances is now considered to be excessively stringent, these analyses were not conducted.
overall goodness-of-fit for the combined file shown in Table 8.1, a satisfactory fit arguing for an equivalent number of factors to best represent the data across groups. In contrast to single-group analyses, however, multigroup analyses yield only one set of fit statistics for overall χ² model fit, albeit the χ² value for the first group, together with its computed percentage contribution to the overall χ² statistic, is also reported. As noted earlier, because χ² statistics are summative, the overall value for the multigroup model will equal the sum of this statistic for each group. Let us go back now and again review the baseline models under study (Table 8.1). Turning first to specifications for adolescent males, we see that there are two cross-loadings (λ61, λ93), in addition to five error covariances (θ8,5; θ10,7; θ11,5; θ11,8; θ7,2), over and above the initially hypothesized model. By way of contrast, specifications for adolescent females include only one cross-loading (λ61) and three error covariances (θ8,5; θ10,7; θ11,5). Thus, prior to testing for invariance, we already know that three parameters are different across the two groups. As such, these parameters are estimated freely for males and are not constrained equal across males and females. With the exception of these three parameters, then, all remaining estimated parameters can be tested for their invariance across groups. This situation provides an excellent example of what is termed "partial measurement invariance" (see Byrne et al., 1989). Let us turn now to relevant portions of the output file related to the testing of Hypothesis I that are presented in Table 8.2.
LISREL Output File

As stated in chapter 6 with respect to multitrait-multimethod models, I believe that the parameter specifications presented in the LISREL output will be particularly helpful to you in tracking the estimated versus constrained parameters tested in this chapter; thus, they are included in the discussion of each output file. Turning to this information related to the first multigroup model, you will see that each numbered parameter represents a different number, thereby indicating that no equality constraints have been imposed. As such, from 132 pieces of information ((11 × 12)/2 for each group), there are 67 parameters to be estimated, leaving 65 degrees of freedom. Reviewing the fit statistics for this initial model, we can conclude that the overall fit is consistent with one that is well fitting (RMSEA = .05; GFI = .97; CFI =
Table 8.2
LISREL Output: Parameter Estimates and Fit Statistics for Baseline Models

PARAMETER SPECIFICATIONS

LAMBDA-X
            GENSC   ACADSC    ENGSC   MATHSC
SDQGSC          0        0        0        0
APIGSC          1        0        0        0
SESGSC          2        0        0        0
SDQASC          0        0        0        0
SCAASC          0        3        0        0
SDQESC          4        0        0        0
APIESC          0        0        5        0
SCAESC          0        0        6        0
SDQMSC          0        0        7        0
APIMSC          0        0        0        8
SCAMSC          0        0        0        9

PHI
            GENSC   ACADSC    ENGSC   MATHSC
GENSC          10
ACADSC         11       12
ENGSC          13       14       15
MATHSC         16       17       18       19

THETA-DELTA
           SDQGSC   APIGSC   SESGSC   SDQASC   SCAASC   SDQESC
SDQGSC         20
APIGSC          0       21
SESGSC          0        0       22
SDQASC          0        0        0       23
SCAASC          0        0        0        0       24
SDQESC          0        0        0        0        0       25
APIESC          0       26        0        0        0        0
SCAESC          0        0        0        0       28        0
SDQMSC          0        0        0        0        0        0
APIMSC          0        0        0        0        0        0
SCAMSC          0        0        0        0       33        0

THETA-DELTA
           APIESC   SCAESC   SDQMSC   APIMSC   SCAMSC
APIESC         27
SCAESC          0       29
SDQMSC          0        0       30
APIMSC         31        0        0       32
SCAMSC          0       34        0        0       35
.99. It is important to note, however, that although these results suggest that self-concept structure is most appropriately described by a four-factor model for both adolescent boys and girls, they do not necessarily imply that the pattern of factor loadings is the same across gender. This hypothesis remains to be tested, and we turn to this issue next.
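Because no cross-group constraints are imposed in Model 1, the additivity of the χ² statistic noted above is easy to verify against the values already reported. A short sketch (the variable names are mine; the numbers come from the baseline models and the output in Table 8.2):

```python
# In a multigroup run with no cross-group constraints, the overall
# chi-square is the sum of the per-group values, and the degrees of
# freedom equal the total data moments minus the total free parameters.
pieces_per_group = 11 * 12 // 2        # 66 variances/covariances
total_pieces = 2 * pieces_per_group    # 132
free_parameters = 67                   # numbered parameters in Table 8.2
df_model1 = total_pieces - free_parameters
print(df_model1)                       # 65

chi_male, chi_female = 60.96, 77.30    # baseline-model chi-square values
print(round(chi_male + chi_female, 2)) # 138.26, the Model 1 chi-square
```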
Table 8.2 cont.
GROUP 2 = FEMALES

PARAMETER SPECIFICATIONS

LAMBDA-X
            GENSC   ACADSC    ENGSC   MATHSC
SDQGSC          0        0        0        0
APIGSC         36        0        0        0
SESGSC         37        0        0        0
SDQASC          0        0        0        0
SCAASC          0       38        0        0
SDQESC         39        0        0        0
APIESC          0        0       40        0
SCAESC          0        0       41        0
SDQMSC          0        0        0        0
APIMSC          0        0        0       42
SCAMSC          0        0        0       43

PHI
            GENSC   ACADSC    ENGSC   MATHSC
GENSC          44
ACADSC         45       46
ENGSC          47       48       49
MATHSC         50       51       52       53

THETA-DELTA
           SDQGSC   APIGSC   SESGSC   SDQASC   SCAASC   SDQESC
SDQGSC         54
APIGSC          0       55
SESGSC          0        0       56
SDQASC          0        0        0       57
SCAASC          0        0        0        0       58
SDQESC          0        0        0        0        0       59
APIESC          0        0        0        0        0        0
SCAESC          0        0        0        0       61        0
SDQMSC          0        0        0        0        0        0
APIMSC          0        0        0        0        0        0
SCAMSC          0        0        0        0       66        0

THETA-DELTA
           APIESC   SCAESC   SDQMSC   APIMSC   SCAMSC
APIESC         60
SCAESC          0       62
SDQMSC          0        0       63
APIMSC         64        0        0       65
SCAMSC          0        0        0        0       67

GOODNESS OF FIT STATISTICS

CHI-SQUARE WITH 65 DEGREES OF FREEDOM = 138.26 (P = 0.00000033)
ESTIMATED NON-CENTRALITY PARAMETER (NCP) = 73.26
ROOT MEAN SQUARE ERROR OF APPROXIMATION (RMSEA) = 0.052
EXPECTED CROSS-VALIDATION INDEX (ECVI) = 0.33
STANDARDIZED RMR = 0.037
GOODNESS OF FIT INDEX (GFI) = 0.97
COMPARATIVE FIT INDEX (CFI) = 0.99
Hypothesis II: Testing for an Invariant Pattern of Factor Loadings

LISREL Input File

It is important to note that the entire file, as shown in Table 8.1, remains the same throughout all subsequent analyses, with the exception of specified equality constraints. Because these constraints are always specified in terms of the last group, only input beginning at the MO line for this group is shown for the present, as well as the remaining, models. As such, the input file bearing on Hypothesis II is presented in Table 8.3. The critical part of this file involves the equality constraints, indicated by the keyword EQ, immediately preceding the OU line. Three important features of equality constraints, in general, are worthy of note. First, the specification of any matrix element in a single group is expressed by means of two numerals within parentheses (e.g., LX(2,1)). Reference to some matrix element in another group, however, requires the inclusion of three numerals within parentheses, where the first numeral identifies the group of interest. For example, the specification LX(2,2,1) refers to the element LX(2,1) in Group 2; LX(3,2,1) refers to the same element for Group 3. Second, in defining equality constraints between groups, the parameter to be constrained is specified as free
Table 8.3
LISREL Input: Testing for an Invariant Pattern of Factor Loadings

MO LX=FU,FI PH=SY,FR TD=SY
LK
GENSC ACADSC ENGSC MATHSC
FR LX(2,1) LX(3,1) LX(5,2) LX(7,3) LX(8,3) LX(10,4) LX(11,4)
FR LX(6,1)
FR TD(8,5) TD(10,7) TD(11,5)
VA 30.0 LX(1,1) LX(4,2) LX(6,3) LX(9,4)
ST 15.0 LX(2,1) LX(3,1) LX(5,2) LX(7,3) LX(8,3) LX(10,4) LX(11,4)
ST 5.0 LX(6,1)
ST 1.0 PH(1,1) PH(2,2) PH(3,3) PH(4,4)
ST .05 PH(2,1) PH(3,1) PH(4,1) PH(3,2) PH(4,2)
ST -.01 PH(4,3)
ST 40.0 TD(1,1) TD(2,2) TD(4,4) TD(6,6) TD(7,7) TD(9,9)
ST 10.0 TD(5,5) TD(8,8) TD(11,11) TD(10,7)
ST 18.0 TD(10,10)
ST 5.0 TD(8,5) TD(11,5) TD(3,3)
EQ LX(1,2,1) LX(2,1)
EQ LX(1,3,1) LX(3,1)
EQ LX(1,5,2) LX(5,2)
EQ LX(1,7,3) LX(7,3)
EQ LX(1,8,3) LX(8,3)
EQ LX(1,10,4) LX(10,4)
EQ LX(1,11,4) LX(11,4)
EQ LX(1,6,1) LX(6,1)
OU NS
in the first group, albeit equated (as indicated by the EQ keyword) to the first group for all remaining groups. For example, in testing for the invariance of the element LX(2,1) across three groups, the specification would be:
Group 1: FR LX(2,1)
Group 2: EQ LX(1,2,1) LX(2,1)
Group 3: EQ LX(1,2,1) LX(2,1)
Accordingly, the element LX(2,1) will be freely estimated for Group 1 only; for Groups 2 and 3, the value of LX(2,1) will be constrained equal to the estimate obtained for Group 1. Finally, only estimated parameters can have equality constraints imposed on them (i.e., fixed parameters are not eligible). Turning to the equality constraints shown in Table 8.3, we see that all factor loadings, including the common cross-loading (λ61), but not including the group-specific loading of λ93 for males, are being tested for their equivalence across gender. Although it is also possible to specify entire matrices invariant, as we shall see with respect to Hypothesis III, this is not feasible here, given the situation of the additional cross-loading for males.
LISREL Output File

Only the parameter specifications section of the output file for this (and subsequent invariance models) is shown in Table 8.4; a summary of all goodness-of-fit statistics is presented later in Table 8.9. Turning to the Lambda matrix, however, you will see that all parameters, except the one indicated by the numeral 7 (which represents λ93), are the same for both the male and female groups; all other estimated parameters have different numerals. This pattern, of course, indicates that the λ estimates for Group 2 will be constrained equal to those from Group 1. In testing the validity of these constraints, we compare the χ² value for this model (χ²(73) = 145.37) with that of the initial model in which no constraints were imposed (χ²(65) = 138.26). Provided with findings of a nonsignificant difference in χ² values, we conclude that the hypothesis of an invariant pattern of factor loadings holds. In this regard, the Δχ² value is 7.11 with 8 degrees of freedom, which is nonsignificant. Given these findings, we can now feel confident that all measures of self-concept are operating in the same way for both groups, and we proceed in testing for equality of the structural parameters. No doubt you are wondering what the procedure would have been had not all measurement
parameters been found equal across the groups. Before continuing with our invariance testing, then, I think it best to address this issue. Essentially, such results would invoke the use of partial measurement invariance in testing for the equality of factor variances/covariances. However, the use of partial measurement invariance is contingent on the number of indicator variables used in measuring each latent construct. Given multiple indicators and at least one invariant measure (other than the one fixed to 1.00 for identification purposes), remaining noninvariant measures can be specified as unconstrained across groups; that is, they can be freely estimated (Byrne et al., 1989; Muthén & Christoffersson, 1981).
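The χ² difference test just reported can be reproduced without a statistics package, since the chi-square survival function has a closed form when the degrees of freedom are even. A sketch (the function name is mine):

```python
import math

# Chi-square difference test for Hypothesis II: Model 2 vs. Model 1.
# For an even number of degrees of freedom the chi-square survival
# function reduces to a finite Poisson-type sum, so no stats library
# is needed.
def chi2_sf_even_df(x, df):
    assert df % 2 == 0
    half = x / 2.0
    return math.exp(-half) * sum(half**k / math.factorial(k)
                                 for k in range(df // 2))

delta_chi2 = round(145.37 - 138.26, 2)  # 7.11
delta_df = 73 - 65                       # 8
p = chi2_sf_even_df(delta_chi2, delta_df)
print(delta_chi2, round(p, 2))           # 7.11 0.52 -> nonsignificant
```

With p well above .05, the hypothesis of an invariant pattern of factor loadings is retained, as stated above.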
Hypothesis III: Testing for Invariant Factor Variances/Covariances

LISREL Input File

We turn now to the testing of equality constraints related to the factor variances and covariances.⁴ The LISREL input related to these specifications is presented in Table 8.5. In testing for the invariance of factor variances and covariances across gender, the hypothesis of interest here focuses on the structure of self-concept. As noted earlier, the testing of invariance hypotheses involves increasingly restrictive models; as such, the model to be tested here (Model 3) is more restrictive than Model 2. Because the equality of the pattern of factor loadings was found to be tenable, these constraints are also maintained for Model 3. In addition, however, the entire variance/covariance matrix is constrained equal across groups.⁵ As shown on the MO line of the input file (Table 8.5), specification of the entirely invariant Phi matrix is indicated by the command PH=IN.
LISREL Output File

Parameter specifications related to Model 3 are presented in Table 8.6. Note first that, as was the case in Model 2, the numbered parameters (i.e., the parameters to be estimated) in the Lambda matrix are the same across groups, with the exception of (7), which represents λ93. In addition, note the
⁴Although testing for equivalent factor covariances would seem more meaningful than testing for equivalent variances within the context of the present data, we test for the equality of both for purposes of consistency with the original study.
⁵Although the Δχ² value was based on the difference between Models 3 and 2 in the original study (Byrne & Shavelson, 1987), it is equally appropriate to compute the difference between Models 3 and 1, with the accompanying difference in degrees of freedom.
Table 8.4
LISREL Output: Parameter Specifications for Test of Invariant Factor Loadings

PARAMETER SPECIFICATIONS

LAMBDA-X
            GENSC   ACADSC    ENGSC   MATHSC
SDQGSC          0        0        0        0
APIGSC          1        0        0        0
SESGSC          2        0        0        0
SDQASC          0        0        0        0
SCAASC          0        3        0        0
SDQESC          4        0        0        0
APIESC          0        0        5        0
SCAESC          0        0        6        0
SDQMSC          0        0        7        0
APIMSC          0        0        0        8
SCAMSC          0        0        0        9

PHI
            GENSC   ACADSC    ENGSC   MATHSC
GENSC          10
ACADSC         11       12
ENGSC          13       14       15
MATHSC         16       17       18       19

THETA-DELTA
           SDQGSC   APIGSC   SESGSC   SDQASC   SCAASC   SDQESC
SDQGSC         20
APIGSC          0       21
SESGSC          0        0       22
SDQASC          0        0        0       23
SCAASC          0        0        0        0       24
SDQESC          0        0        0        0        0       25
APIESC          0       26        0        0        0        0
SCAESC          0        0        0        0       28        0
SDQMSC          0        0        0        0        0        0
APIMSC          0        0        0        0        0        0
SCAMSC          0        0        0        0       33        0

THETA-DELTA
           APIESC   SCAESC   SDQMSC   APIMSC   SCAMSC
APIESC         27
SCAESC          0       29
SDQMSC          0        0       30
APIMSC         31        0        0       32
SCAMSC          0       34        0        0       35
statement in the Group 1 section of the output that "PHI EQUALS PHI IN THE FOLLOWING GROUP"; accordingly, the numbering of these parameters appears in the output for Group 2. All remaining parameters are freely estimated in both groups. Results from the estimation of Model 3 yielded a χ²(83) = 195.30. Because the difference in χ² values between this model and Model 1 (Δχ²(18) = 57.04) was statistically significant (p < .001), the hypothesis of invariant factor
Table 8.4 cont.
GROUP 2 = FEMALES

PARAMETER SPECIFICATIONS

LAMBDA-X
            GENSC   ACADSC    ENGSC   MATHSC
SDQGSC          0        0        0        0
APIGSC          1        0        0        0
SESGSC          2        0        0        0
SDQASC          0        0        0        0
SCAASC          0        3        0        0
SDQESC          4        0        0        0
APIESC          0        0        5        0
SCAESC          0        0        6        0
SDQMSC          0        0        0        0
APIMSC          0        0        0        8
SCAMSC          0        0        0        9

PHI
            GENSC   ACADSC    ENGSC   MATHSC
GENSC          36
ACADSC         37       38
ENGSC          39       40       41
MATHSC         42       43       44       45

THETA-DELTA
           SDQGSC   APIGSC   SESGSC   SDQASC   SCAASC   SDQESC
SDQGSC         46
APIGSC          0       47
SESGSC          0        0       48
SDQASC          0        0        0       49
SCAASC          0        0        0        0       50
SDQESC          0        0        0        0        0       51
APIESC          0        0        0        0        0        0
SCAESC          0        0        0        0       53        0
SDQMSC          0        0        0        0        0        0
APIMSC          0        0        0        0        0        0
SCAMSC          0        0        0        0       58        0

THETA-DELTA
           APIESC   SCAESC   SDQMSC   APIMSC   SCAMSC
APIESC         52
SCAESC          0       54
SDQMSC          0        0       55
APIMSC         56        0        0       57
SCAMSC          0        0        0        0       59
variances and covariances must be rejected. Faced with these results, our next task is to determine which variances and covariances are contributing to this overall matrix inequality. This procedure involves testing, independently, the invariance of each element in the Phi matrix. As such, a model is specified in which the first element (say, φ11) is constrained equal across groups, and
Table 8.5
LISREL Input: Testing for Invariance of Factor Variances and Covariances

MO LX=FU,FI PH=IN TD=SY
LK
GENSC ACADSC ENGSC MATHSC
FR LX(2,1) LX(3,1) LX(5,2) LX(7,3) LX(8,3) LX(10,4) LX(11,4)
FR LX(6,1)
FR TD(8,5) TD(10,7) TD(11,5)
VA 30.0 LX(1,1) LX(4,2) LX(6,3) LX(9,4)
ST 15.0 LX(2,1) LX(3,1) LX(5,2) LX(7,3) LX(8,3) LX(10,4) LX(11,4)
ST 5.0 LX(6,1)
ST 1.0 PH(1,1) PH(2,2) PH(3,3) PH(4,4)
ST .05 PH(2,1) PH(3,1) PH(4,1) PH(3,2) PH(4,2)
ST -.01 PH(4,3)
ST 40.0 TD(1,1) TD(2,2) TD(4,4) TD(6,6) TD(7,7) TD(9,9)
ST 10.0 TD(5,5) TD(8,8) TD(11,11) TD(10,7)
ST 18.0 TD(10,10)
ST 5.0 TD(8,5) TD(11,5) TD(3,3)
EQ LX(1,2,1) LX(2,1)
EQ LX(1,3,1) LX(3,1)
EQ LX(1,5,2) LX(5,2)
EQ LX(1,7,3) LX(7,3)
EQ LX(1,8,3) LX(8,3)
EQ LX(1,10,4) LX(10,4)
EQ LX(1,11,4) LX(11,4)
EQ LX(1,6,1) LX(6,1)
OU NS
then comparing this model with Model 2, in which only the pattern of factor loadings is constrained equal across groups. The most rigorous way to conduct this set of tests is to cumulatively constrain all parameters found to be equivalent across groups.⁶ Only two input files are shown here. The first one (Table 8.7) represents the initial model tested, with only PH(1,1) constrained equal across gender. The second input file (Table 8.8) represents the last model tested with respect to independent parameters in the Phi matrix; it includes all matrix elements found to be invariant across groups, in addition to the last element tested, PH(4,3), which was found to be noninvariant. A summary of all tests for invariance is presented in Table 8.9. Focusing on the variances and covariances, let us review the tests conducted for each element in the Phi matrix. Results indicate that the variance of general self-concept (φ11) was found to be invariant. Thus, it was held invariant while the variance of academic self-concept (φ22) was tested for its equivalency across groups. As shown in Table 8.9, the latter was found to be noninvariant across gender. As a consequence, the constraint for φ22 was discontinued. Thus, the model used in testing for the invariance of English self-concept included two constraints in the Phi matrix: one for the variance of general self-concept, the other for the
⁶In the original study, the noninvariant parameters were tested independently, albeit they were not constrained in a cumulative fashion. Although this procedure is also appropriate, it is a less stringent test of parameter equality.
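The cumulative testing logic just described can be sketched numerically. In the sketch below, each Phi element is tested by adding its constraint to the Model 2 specification and checking the resulting χ² difference against the conventional .05 critical value; the χ² values are taken from the summary of these analyses (Table 8.9), and the critical values are standard tabled cutoffs, not LISREL output:

```python
# Cumulative invariance tests for the factor variances: constraints
# found tenable are retained while the next element is tested.
CRIT_05 = {1: 3.84, 2: 5.99, 3: 7.81}   # .05 chi-square cutoffs
MODEL2_CHI2, MODEL2_DF = 145.37, 73      # loadings-invariant model

cumulative_models = {   # constraint(s) added: (chi-square, df)
    "PH(1,1) general SC variance": (146.52, 74),
    "PH(3,3) English SC variance": (148.47, 75),
    "PH(4,4) mathematics SC variance": (149.32, 76),
}
verdicts = {}
for name, (chi2, df) in cumulative_models.items():
    d_chi2 = round(chi2 - MODEL2_CHI2, 2)
    d_df = df - MODEL2_DF
    verdicts[name] = "invariant" if d_chi2 < CRIT_05[d_df] else "noninvariant"
    print(f"{name}: delta chi2({d_df}) = {d_chi2} -> {verdicts[name]}")
```

All three of these variances test as invariant, which is why their constraints remain in place in the later models; the academic SC variance, by contrast, exceeds its cutoff and is released.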
Table 8.6
LISREL Output: Parameter Specifications for Test of Invariance of Variances and Covariances

PARAMETER SPECIFICATIONS

LAMBDA-X
            GENSC   ACADSC    ENGSC   MATHSC
SDQGSC          0        0        0        0
APIGSC          1        0        0        0
SESGSC          2        0        0        0
SDQASC          0        0        0        0
SCAASC          0        3        0        0
SDQESC          4        0        0        0
APIESC          0        0        5        0
SCAESC          0        0        6        0
SDQMSC          0        0        7        0
APIMSC          0        0        0        8
SCAMSC          0        0        0        9

PHI

PHI EQUALS PHI IN THE FOLLOWING GROUP

THETA-DELTA
           SDQGSC   APIGSC   SESGSC   SDQASC   SCAASC   SDQESC
SDQGSC         20
APIGSC          0       21
SESGSC          0        0       22
SDQASC          0        0        0       23
SCAASC          0        0        0        0       24
SDQESC          0        0        0        0        0       25
APIESC          0       26        0        0        0        0
SCAESC          0        0        0        0       28        0
SDQMSC          0        0        0        0        0        0
APIMSC          0        0        0        0        0        0
SCAMSC          0        0        0        0       33        0

THETA-DELTA
           APIESC   SCAESC   SDQMSC   APIMSC   SCAMSC
APIESC         27
SCAESC          0       29
SDQMSC          0        0       30
APIMSC         31        0        0       32
SCAMSC          0       34        0        0       35

GROUP 2 = FEMALES

PARAMETER SPECIFICATIONS

LAMBDA-X
            GENSC   ACADSC    ENGSC   MATHSC
SDQGSC          0        0        0        0
APIGSC          1        0        0        0
SESGSC          2        0        0        0
SDQASC          0        0        0        0
SCAASC          0        3        0        0
SDQESC          4        0        0        0
APIESC          0        0        5        0
SCAESC          0        0        6        0
SDQMSC          0        0        0        0
APIMSC          0        0        0        8
SCAMSC          0        0        0        9

PHI
            GENSC   ACADSC    ENGSC   MATHSC
GENSC          10
ACADSC         11       12
ENGSC          13       14       15
MATHSC         16       17       18       19

THETA-DELTA
           SDQGSC   APIGSC   SESGSC   SDQASC   SCAASC   SDQESC
SDQGSC         36
APIGSC          0       37
SESGSC          0        0       38
SDQASC          0        0        0       39
SCAASC          0        0        0        0       40
SDQESC          0        0        0        0        0       41
APIESC          0        0        0        0        0        0
SCAESC          0        0        0        0       43        0
SDQMSC          0        0        0        0        0        0
APIMSC          0        0        0        0        0        0
SCAMSC          0        0        0        0       48        0

THETA-DELTA
           APIESC   SCAESC   SDQMSC   APIMSC   SCAMSC
APIESC         42
SCAESC          0       44
SDQMSC          0        0       45
APIMSC         46        0        0       47
SCAMSC          0        0        0        0       49
Table 8.7
LISREL Input: Initial Test for Invariance of Individual Variances/Covariances

MO LX=FU,FI PH=SY,FR TD=SY
LK
GENSC ACADSC ENGSC MATHSC
FR LX(2,1) LX(3,1) LX(5,2) LX(7,3) LX(8,3) LX(10,4) LX(11,4)
FR LX(6,1)
FR TD(8,5) TD(10,7) TD(11,5)
VA 30.0 LX(1,1) LX(4,2) LX(6,3) LX(9,4)
ST 15.0 LX(2,1) LX(3,1) LX(5,2) LX(7,3) LX(8,3) LX(10,4) LX(11,4)
ST 5.0 LX(6,1)
ST 1.0 PH(1,1) PH(2,2) PH(3,3) PH(4,4)
ST .05 PH(2,1) PH(3,1) PH(4,1) PH(3,2) PH(4,2)
ST -.01 PH(4,3)
ST 40.0 TD(1,1) TD(2,2) TD(4,4) TD(6,6) TD(7,7) TD(9,9)
ST 10.0 TD(5,5) TD(8,8) TD(11,11) TD(10,7)
ST 18.0 TD(10,10)
ST 5.0 TD(8,5) TD(11,5) TD(3,3)
EQ LX(1,2,1) LX(2,1)
EQ LX(1,3,1) LX(3,1)
EQ LX(1,5,2) LX(5,2)
EQ LX(1,7,3) LX(7,3)
EQ LX(1,8,3) LX(8,3)
EQ LX(1,10,4) LX(10,4)
EQ LX(1,11,4) LX(11,4)
EQ LX(1,6,1) LX(6,1)
EQ PH(1,1,1) PH(1,1)
OU NS
Table 8.8
LISREL Input: Final Test for Invariance of Individual Variances/Covariances

EQ LX(1,2,1) LX(2,1)
EQ LX(1,3,1) LX(3,1)
EQ LX(1,5,2) LX(5,2)
EQ LX(1,7,3) LX(7,3)
EQ LX(1,8,3) LX(8,3)
EQ LX(1,10,4) LX(10,4)
EQ LX(1,11,4) LX(11,4)
EQ LX(1,6,1) LX(6,1)
EQ PH(1,1,1) PH(1,1)
EQ PH(1,3,3) PH(3,3)
EQ PH(1,4,4) PH(4,4)
EQ PH(1,2,1) PH(2,1)
EQ PH(1,3,1) PH(3,1)
EQ PH(1,4,1) PH(4,1)
EQ PH(1,3,2) PH(3,2)
EQ PH(1,4,3) PH(4,3)
OU NS
Table 8.9
Summary of Tests for Invariance of Self-concept Measurements and Structure

Competing Models                                      χ²     df    Δχ²    Δdf   ECVI  RMSEA  GFI  CFI
1 Number of factors invariant                      138.26    65      -     -    .33    .05   .97  .99
2 Model 1 with pattern of factor loadings
  held invariant                                   145.37    73    7.11    8    .32    .05   .97  .99
3 Model 2 with all factor variances and
  covariances held invariant                       195.30    83   57.04** 18    .35    .06   .96  .98
4 Model 2 with equivalent factor variances and
  covariances held cumulatively invariant
  Variances
    General SC                                     146.52    74    1.15    1    .32    .05   .97  .99
    Academic SC                                    164.24    75   17.72**  2    .34    .05   .96  .99
    English SC                                     148.47    75    3.10    2    .32    .05   .97  .99
    Mathematics SC                                 149.32    76    3.95    3    .31    .05   .97  .99
  Covariances
    General/Academic SC                            154.46    77    9.09    4    .32    .05   .97  .99
    General/English SC                             156.16    78   10.79    5    .32    .05   .97  .99
    General/Mathematics SC                         157.08    79   11.71    6    .32    .05   .97  .99
    Academic/English SC                            157.09    80   11.72    7    .31    .05   .97  .99
    Academic/Mathematics SC                        190.97    81   45.60**  8    .35    .06   .96
    English/Mathematics SC                         171.40    81   26.03*   8    .33    .05   .96

*p < .05. **p < .001.
FR LX(2,1) LX(3,1) LX(4,1) LX(5,1) LX(6,1) LX(8,2) LX(9,2) LX(10,2)
FR LX(11,2) LX(13,3) LX(14,3) LX(15,3) LX(16,3) LX(18,4) LX(19,4)
FR LX(20,4) LX(21,4)
VA 1.0 LX(1,1) LX(7,2) LX(12,3) LX(17,4)
ST .9 LX(2,1) LX(3,1) LX(4,1) LX(5,1) LX(6,1) LX(8,2) LX(9,2) LX(10,2)
ST .9 LX(11,2) LX(13,3) LX(14,3) LX(15,3) LX(16,3) LX(18,4) LX(19,4)
ST .9 LX(20,4) LX(21,4)
ST 1.0 PH(1,1) PH(2,2) PH(3,3) PH(4,4)
ST .4 PH(2,1) PH(3,1) PH(3,2) PH(4,1) PH(4,2)
ST -.02 PH(4,3)
ST .6 TD(1,1) TD(2,2) TD(3,3) TD(4,4) TD(5,5) TD(6,6) ...
ST 1.0 PH(1,1) PH(2,2) PH(3,3)
ST 2.0 PH(4,4)
ST .5 PH(2,1) PH(3,1) PH(3,2) PH(4,2)
ST .2 PH(4,1)
ST .02 PH(4,3)
ST .6 TD(1,1) TD(2,2) TD(3,3) TD(4,4) TD(5,5) TD(6,6) TD(7,7) TD(8,8)
ST .6 TD(9,9) TD(10,10) TD(11,11) TD(12,12) TD(13,13) TD(14,14) TD(15,15)
ST .6 TD(16,16) TD(17,17) TD(18,18) TD(19,19) TD(20,20) TD(21,21)
OU NS
The goodness-of-fit for this multigroup model (Model 1), in which no equality constraints have been imposed, yielded a reasonable fit to the data (χ²(366) = 1230.64; RMSEA = .07; GFI = .89; CFI = .92).¹ On the basis of these results, we can conclude that, based on data from both low and high academic tracks, the SDQ-III is best described by a four-factor model. Nonetheless, as noted in chapter 8, this finding in no way guarantees that the pattern and size of factor loadings are necessarily equivalent across groups. Thus, we turn next to the testing of this second hypothesis.
¹Recall that in multigroup models the overall model χ² value equals the sum of the χ²s for each group estimated separately.
Chapter 9
Hypothesis II: Testing for the Invariance of Factor Loadings

LISREL Input File

In testing this hypothesis, we respecify Model 1 with equality constraints placed on all factor loadings, except those that have been fixed to a value of 1.0 for purposes of statistical identification and scaling. The input file for this model (Model 2) is presented in Table 9.2; however, only specifications related to the MO line, and all subsequent lines relative to the high-track group, are shown. Of particular note in this file is the specification of LX=IN on the MO card, thereby indicating that all estimated parameters in the entire factor loading matrix (Λx) are to be constrained equal across groups; relatedly, the specification of all λ parameters (fixed and free), and their accompanying start values, have been deleted for Group 2 (the high track).² Results from this test of Hypothesis II determined that the postulated equality of factor loadings across track was tenable (Δχ²(17) = 23.91), the nonsignificant Δχ² representing the difference between Models 1 and 2. (Goodness-of-fit statistics pertinent to Model 2, as well as all subsequent models, are summarized in Tables 9.7 and 9.8.) From this information, we can conclude that all items comprising the four selected subscales of the SDQ-III are measuring the same SC facet in exactly the same way for both groups. In other words, one can be satisfied that the content comprising each SDQ-III item is being interpreted in the same way by high school students in both the low and high academic tracks. Of course, had the equality of the factor loading matrix NOT been tenable, we would then have needed to test for the invariance of each of its individual parameters.
Hypothesis III: Testing for Invariant Factor Covariances

Testing for the invariance of factor covariances bears on the group equivalence of SC relations as measured by the four SDQ-III subscales. We test this hypothesis by imposing equality constraints on only the covariances within
²It is worth noting, however, that had these specifications been retained in the file, the imposition of a constrained Λ matrix would have led to the program overriding these specifications.
Table 9.2
LISREL Input: Testing for the Invariance of Factor Loadings

MO LX=IN PH=SY,FR TD=SY
ST 1.0 PH(1,1) PH(2,2) PH(3,3)
ST 2.0 PH(4,4)
ST .5 PH(2,1) PH(3,1) PH(3,2) PH(4,2)
ST .2 PH(4,1)
ST .02 PH(4,3)
ST .6 TD(1,1) TD(2,2) TD(3,3) TD(4,4) TD(5,5) TD(6,6) TD(7,7) TD(8,8)
ST .6 TD(9,9) TD(10,10) TD(11,11) TD(12,12) TD(13,13) TD(14,14) TD(15,15)
ST .6 TD(16,16) TD(17,17) TD(18,18) TD(19,19) TD(20,20) TD(21,21)
OU NS
the matrix. Consistent with the strategy outlined in chapter 8, whereby each successive model is more restrictive than the former, Hypothesis III is tested by specifying a model in which the factor loading matrix remains invariant and all factor covariances are constrained to be equal. The relevant portion of the input file for this model (Model 3) is presented in Table 9.3. In reviewing this input file, the most noteworthy point to be made is that because we are only interested in structural relations among the four SC facets, and not in the invariance of the factor variances, the equality constraint for each covariance must be specified using the EQ keyword. In testing Hypothesis III, we compare the fit of Model 3 with that of Model 1; this comparison yielded a Δχ²(23) value of 30.33, which is not statistically significant and, thus, Hypothesis III remains tenable. Presented with these findings, we can feel confident that SC structure, as measured by the SDQ-III in this example, is the same for both low and high academic tracks. (By way of comparison, recall the differences found in SC structure across gender, as illustrated in chapter 8.)
Table 9.3
LISREL Input: Testing for the Invariance of Factor Covariances

MO LX=IN PH=SY,FR TD=SY
ST 1.0 PH(1,1) PH(2,2) PH(3,3)
ST 2.0 PH(4,4)
ST .5 PH(2,1) PH(3,1) PH(3,2) PH(4,2)
ST .2 PH(4,1)
ST .02 PH(4,3)
ST .6 TD(1,1) TD(2,2) TD(3,3) TD(4,4) TD(5,5) TD(6,6) TD(7,7) TD(8,8)
ST .6 TD(9,9) TD(10,10) TD(11,11) TD(12,12) TD(13,13) TD(14,14) TD(15,15)
ST .6 TD(16,16) TD(17,17) TD(18,18) TD(19,19) TD(20,20) TD(21,21)
EQ PH(1,2,1) PH(2,1)
EQ PH(1,3,1) PH(3,1)
EQ PH(1,4,1) PH(4,1)
EQ PH(1,3,2) PH(3,2)
EQ PH(1,4,2) PH(4,2)
EQ PH(1,4,3) PH(4,3)
OU NS
Chapter 9
Hypothesis IV: Testing for Invariant Item-Pair Reliabilities

Generally speaking, in multiple-indicator CFA models, testing for the invariance of reliability is neither necessary (Jöreskog, 1971a), nor of particular interest when the scales are used merely as CFA indicators and not as measures in their own right (B. O. Muthén, personal communication, October 1987). Although Jöreskog (1971b) demonstrated the steps involved in testing for a completely equivalent model across groups (i.e., invariant Λ, Φ, Θ), this model is considered to be an excessively stringent test of factorial invariance (Muthén, personal communication, January 1987). Indeed, Jöreskog (1971a) has shown that although it is necessary that multiple measures of a latent construct be congeneric (i.e., believed to measure the same latent construct), they need not exhibit invariant variances and error/uniquenesses. On the other hand, when the multiple indicators of a CFA model represent items from a single measuring instrument, it may be of interest to test for the equivalence of item reliabilities as a means of detecting evidence of item bias (see Benson, 1987). In contrast to the conceptual definition of item bias generally associated with cognitive instruments (i.e., individuals of equal ability have unequal probability of success), item bias related to affective instruments reflects on their validity and, hence, on the question of whether items generate the same meaning across groups. Evidence of such item bias is a clear indication that the scores are differentially valid (Green, 1975). From classical test theory, item reliability is defined as the ratio of true score variance to total score variance, the latter being the sum of true score and error score variances. Translated into LISREL lexicon, this formulation is represented by φ/(φ + θδ), where φ represents factor true score variance and θδ represents error score variance associated with measures of the factor.
For example, in the application under discussion in this chapter, φ11/(φ11 + θδ1-θδ6) represents the ratio of true score variance to total score variance for general SC (Factor 1); total score variance represents the true score variance of general SC plus (+) the error score variance associated with the first six item-pair measurements of general SC. It follows from this, therefore, that the reliability of each observed measure in a LISREL-specified model is determined in part by the variance of its corresponding factor; the reliability
ratio thus becomes λ²φ/(λ²φ + θ) (Jöreskog, 1971b). Again, within the framework of the present data, λ²31φ11/(λ²31φ11 + θδ33) would represent the reliability of the third item pair designed to measure general SC. In examining test reliability, it is important to know if the factor variances are equivalent across groups. In the event that they are, then the equality of item reliabilities is tested by constraining the related factor loadings (λs), error terms (θδs), and variances (φs) across groups (Cole & Maxwell, 1985; Rock, Werts, & Flaugher, 1978). If, on the other hand, the factor variances are nonequivalent across groups, then testing for reliability invariance must be based on the ratio of true and error score variances (Cole & Maxwell, 1985; Rock et al., 1978). This latter procedure, however, is quite complex and has not been fully demonstrated in the literature. Although Werts et al. (1976) addressed the testing of a ratio of variances, they did not provide an explicit application related to tests for the invariance of the reliability ratio. With this information in hand, let us proceed with tests for the invariance of item-pair reliability based on our two academic tracks. Adhering to the caveats noted earlier, we begin by testing for the equivalence of factor variances in order to establish the viability of imposing equality constraints on the λ and θδ for each item pair. The related input file for this model (Model 4) is shown in Table 9.4. Because we already know that the factor covariances are invariant across tracks (see Model 3), we are able to constrain the entire Phi matrix to be invariant. Thus, on the MO line, you will note the specification PH=IN.³ Results from this test yielded a statistically significant difference in fit between this model and Model 1 (Δχ²(27) = 57.94, p < .001); the hypothesis of equivalent factor variances must therefore be rejected.
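The reliability ratio just described, true score variance over total score variance, can be computed directly. The following Python sketch implements the general form λ²φ/(λ²φ + θδ); the numeric values in the example calls are illustrative only, not estimates from the SDQ-III data:

```python
# Item reliability in LISREL terms: ratio of true score variance to
# total (true + error) score variance.

def reliability(lam, phi, theta_delta):
    """General form: lambda^2 * phi / (lambda^2 * phi + theta-delta).
    With lam=1.0 (a loading fixed for scaling), this reduces to
    phi / (phi + theta-delta)."""
    true_var = lam ** 2 * phi
    return true_var / (true_var + theta_delta)

# Illustrative (hypothetical) values:
print(round(reliability(1.0, 0.90, 0.35), 2))   # fixed loading -> 0.72
print(round(reliability(0.80, 1.0, 0.36), 2))   # free loading  -> 0.64
```

Note how a smaller loading or a larger error variance lowers the ratio, which is why invariant reliabilities require constraints on λ, θδ, and φ jointly.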
Presented with these findings, of course, we must next determine which factor variances, if any, are invariant across groups. As a consequence, we must proceed in testing for the equality of each factor variance inde-
Table 9.4
LISREL Input: Testing for the Invariance of Factor Variances and Covariances

MO LX=IN PH=IN TD=SY
ST .6 TD(1,1) TD(2,2) TD(3,3) TD(4,4) TD(5,5) TD(6,6) TD(7,7) TD(8,8)
ST .6 TD(9,9) TD(10,10) TD(11,11) TD(12,12) TD(13,13) TD(14,14) TD(15,15)
ST .6 TD(16,16) TD(17,17) TD(18,18) TD(19,19) TD(20,20) TD(21,21)
OU NS
³Alternatively, we could have retained the equality constraints for each of the factor covariances and simply added those for the four factor variances. As such, the specification of PH=SY,FR would remain unchanged.
pendently, albeit cumulatively retaining any that are found to be group-invariant, as was the case in chapter 8. Assessment of these tests was determined by comparing each of the four models with Model 3, in which only the factor covariances were constrained equal. Results revealed only the variance of English SC (φ33) to be invariant across academic track (Δχ²(1) = 0.08); results related to the other SC factors are reported in Table 9.6. For purposes of illustration, and because the variance of only the English SC factor was found to be invariant across track, we proceed now to test for evidence of equivalent reliability of the item pairs comprising this subscale. As with the factor variances, we begin by imposing equality constraints on all five error terms associated with the factor loadings. The input file related to this initial test of error terms is shown in Table 9.5.
Table 9.5
LISREL Input: Testing for the Invariance of Item-Pair Reliabilities for the English Self-Concept Subscale

MO LX=IN PH=SY,FR TD=SY
ST 1.0 PH(1,1) PH(2,2) PH(3,3)
ST 2.0 PH(4,4)
ST .5 PH(2,1) PH(3,1) PH(3,2) PH(4,2)
ST .2 PH(4,1)
ST .02 PH(4,3)
ST .6 TD(1,1) TD(2,2) TD(3,3) TD(4,4) TD(5,5) TD(6,6) ...
EQ ...
OU NS

[The remaining ST and EQ specifications, and the adjoining goodness-of-fit rows ("... invariant, χ² = 1230.64 ... 366 ... 383"; "Model 3 with factor variances held independently invariant: General SC, Academic SC, English SC, Mathematics SC"), are illegible in the source.]
FIG. 11.1. Final Model of Burnout for Calibration Sample of High School Teachers. [Path diagram; the legible construct labels are Self-Esteem, External Locus of Control, Peer Support, Decision Making, Classroom Climate, Work Overload, Role Conflict, Emotional Exhaustion, Personal Accomplishment, and Depersonalization.]
Chapter 11
statistics, which are derived multivariately. It is important that I draw your attention to this fact because, in the present chapter, these same tests for invariance are based on a univariate approach using LISREL 8. As a result, findings from the current set of analyses can deviate slightly from those reported for the original study. Indeed, we found this to be the case in chapter 7. Based on the LM χ² statistics, then, tests of the initially hypothesized model revealed a modest amount of misspecification. In particular, these coefficients suggested the addition of three structural paths to the model (classroom climate → depersonalization; role conflict → external locus of control; self-esteem → external locus of control), whereas the multivariate Wald test (see chapter 7) identified four paths that could be deleted from the model with no deterioration of fit. This final model of burnout structure for high school teachers yielded a χ²(441) = 1514.09, and a CFI = .91. (Although the LM Test identified one additional parameter that, if included in the model, would render a substantial improvement in fit, Emotional Exhaustion → Self-Esteem, this specification was not incorporated; to do so would have been substantively illogical within the contextual focus of the original study.) The task before us now, therefore, is to determine if this final model replicates over Sample B, the validation group of high school teachers.
Testing for Invariance Across Calibration/Validation Samples
LISREL Input File

Following the same procedure that was illustrated in chapters 8 and 9, we begin by first establishing a multigroup baseline model against which we can compare subsequent models that include equality constraints. However, unlike the previous two multigroup models in which each group had its own specific baseline model, the one presented here is specific only to the calibration group. In other words, specifications describing the final model for the Calibration sample are similarly specified for the Validation sample. The input file for this baseline multigroup model is shown in Table 11.1. In reviewing this file, you will see that model specification, and even the start values, are identical for both groups. This, of course, is so because the primary purpose of a cross-validation application of structural equation modeling is to determine if the final model that was derived from exploratory
Application 9
Table 11.1
LISREL Input for Simultaneous Estimation of Calibration and Validation Samples of High School Teachers

! Cross-validation of Final Model of Burnout for High School Teachers "BURNHSCV1"
! GROUP 1 = Calibration Sample
! Simultaneous Model
DA NG=2 NI=32 NO=715 MA=CM
LA
ROLEA1 ROLEA2 ROLEC1 ROLEC2 WORK1 WORK2 CLIMATE1 CLIMATE2 CLIMATE3 CLIMATE4
DEC1 DEC2 SSUP1 SSUP2 PSUP1 PSUP2 SELF1 SELF2 SELF3 XLOC1 XLOC2 XLOC3 XLOC4
XLOC5 EE1 EE2 EE3 DP1 DP2 PA1 PA2 PA3
RA=SECIND1L.DAT FO
(19F4.2/13F4.2)
SELECTION
17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16/
MO NY=16 NX=16 NE=5 NK=7 LX=FU,FI LY=FU,FI BE=FU,FI GA=FU,FI PH=SY,FI PS=SY,FI TE=DI,FR TD=DI,FR
LE
SELF XLOCUS EE DP PA
LK
ROLEA ROLEC WORK CLIMATE DECISION SSUPPORT PSUPPORT
FR LY(2,1) LY(3,1) LY(5,2) LY(6,2) LY(7,2) LY(8,2) LY(10,3) LY(11,3) LY(13,4) LY(15,5) LY(16,5)
FR LX(2,1) LX(4,2) LX(6,3) LX(8,4) LX(9,4) LX(10,4) LX(12,5) LX(14,6) LX(16,7)
FR LX(1,7) LX(3,5) LX(4,6) LX(11,1)
FR PS(1,1) PS(2,2) PS(3,3) PS(4,4) PS(5,5)
FR PH(1,1) PH(2,2) PH(3,3) PH(4,4) PH(5,5) PH(6,6) PH(7,7)
FR PH(2,1) PH(3,1) PH(3,2) PH(6,5) PH(7,5) PH(7,6)
FR GA(1,5) GA(1,7) GA(2,2) GA(2,5) GA(3,3) GA(3,4) GA(4,2) GA(4,4)
FR BE(2,1) BE(4,3) BE(5,1) BE(5,2) BE(5,4)
VA 1.0 LY(1,1) LY(4,2) LY(9,3) LY(12,4) LY(14,5)
ST .8 LY(2,1) LY(3,1) LY(5,2) LY(6,2) LY(7,2) LY(8,2) LY(15,5)
ST .9 LY(11,3) LY(13,4) LY(16,5)
ST 1.0 LY(10,3)
VA 1.0 LX(1,1) LX(3,2) LX(5,3) LX(7,4) LX(11,5) LX(13,6) LX(15,7)
ST .6 LX(8,4) LX(9,4) LX(10,4)
ST .9 LX(6,3)
ST 1.0 LX(2,1) LX(12,5) LX(14,6) LX(16,7)
ST 1.7 LX(4,2)
ST -.16 LX(1,7)
ST -.4 LX(3,5)
ST -.07 LX(4,6)
ST -.6 LX(11,1)
ST .3 PH(1,1) PH(2,2) PH(4,4) PH(5,5) PH(7,7)
ST .8 PH(3,3) PH(6,6)
ST .3 PH(2,1) PH(3,1) PH(3,2) PH(6,5) PH(7,6)
ST .1 PH(7,5)
ST -.1 GA(1,5) GA(2,5)
ST -.2 GA(3,4) GA(4,4)
ST .2 GA(2,2) GA(4,2) GA(1,7)
ST .6 GA(3,3)
ST -.2 BE(2,1) BE(5,2)
ST -.4 BE(5,4)
ST .3 BE(5,1) BE(4,3)
ST .1 PS(1,1)-PS(5,5)
ST .2 TD(1,1)-TD(16,16)
ST .2 TE(1,1)-TE(16,16)
OU NS
work based on one sample can replicate across a second sample. Note also that the data for both groups are in raw matrix form, with the fixed FORTRAN format statement included in the file. As pointed out in chapter 7, when working with full structural equation models using the LISREL program, it is essential that the dependent variables be entered first. Using both Fig. 11.1 and the input file (Table 11.1) in
Table 11.1 cont.

! GROUP 2 = Validation Sample
DA NO=714
LA
ROLEA1 ROLEA2 ROLEC1 ROLEC2 WORK1 WORK2 CLIMATE1 CLIMATE2 CLIMATE3 CLIMATE4
DEC1 DEC2 SSUP1 SSUP2 PSUP1 PSUP2 SELF1 SELF2 SELF3 XLOC1 XLOC2 XLOC3 XLOC4
XLOC5 EE1 EE2 EE3 DP1 DP2 PA1 PA2 PA3
RA=SECIND2L.DAT FO
(19F4.2/13F4.2)
SELECTION
17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16/
MO NY=16 NX=16 NE=5 NK=7 LX=FU,FI LY=FU,FI BE=IN GA=IN PH=IN PS=SY,FI TE=DI,FR TD=DI,FR
LE
SELF XLOCUS EE DP PA
LK
ROLEA ROLEC WORK CLIMATE DECISION SSUPPORT PSUPPORT
FR LY(2,1) LY(3,1) LY(5,2) LY(6,2) LY(7,2) LY(8,2) LY(10,3) LY(11,3) LY(13,4) LY(15,5) LY(16,5)
FR LX(2,1) LX(4,2) LX(6,3) LX(8,4) LX(9,4) LX(10,4) LX(12,5) LX(14,6) LX(16,7)
FR LX(1,7) LX(3,5) LX(4,6) LX(11,1)
FR PS(1,1) PS(2,2) PS(3,3) PS(4,4) PS(5,5)
FR PH(1,1) PH(2,2) PH(3,3) PH(4,4) PH(5,5) PH(6,6) PH(7,7)
FR PH(2,1) PH(3,1) PH(3,2) PH(6,5) PH(7,5) PH(7,6)
FR GA(1,5) GA(1,7) GA(2,2) GA(2,5) GA(3,3) GA(3,4) GA(4,2) GA(4,4)
FR BE(2,1) BE(4,3) BE(5,1) BE(5,2) BE(5,4)
VA 1.0 LY(1,1) LY(4,2) LY(9,3) LY(12,4) LY(14,5)
ST .8 LY(2,1) LY(3,1) LY(5,2) LY(6,2) LY(7,2) LY(8,2) LY(15,5)
ST .9 LY(11,3) LY(13,4) LY(16,5)
ST 1.0 LY(10,3)
VA 1.0 LX(1,1) LX(3,2) LX(5,3) LX(7,4) LX(11,5) LX(13,6) LX(15,7)
ST .6 LX(8,4) LX(9,4) LX(10,4)
ST .9 LX(6,3)
ST 1.0 LX(2,1) LX(12,5) LX(14,6) LX(16,7)
ST 1.7 LX(4,2)
ST -.16 LX(1,7)
ST -.4 LX(3,5)
ST -.07 LX(4,6)
ST -.6 LX(11,1)
ST .3 PH(1,1) PH(2,2) PH(4,4) PH(5,5) PH(7,7)
ST .8 PH(3,3) PH(6,6)
ST .3 PH(2,1) PH(3,1) PH(3,2) PH(6,5) PH(7,6)
ST .1 PH(7,5)
ST -.1 GA(1,5) GA(2,5)
ST -.2 GA(3,4) GA(4,4)
ST .2 GA(2,2) GA(4,2) GA(1,7)
ST .6 GA(3,3)
ST -.2 BE(2,1) BE(5,2)
ST -.4 BE(5,4)
ST .3 BE(5,1) BE(4,3)
ST .1 PS(1,1)-PS(5,5)
ST .2 TD(1,1)-TD(16,16)
ST .2 TE(1,1)-TE(16,16)
OU NS
combination, you will see that the variables to be read first (17-32) are the observed measures for the dependent factors in the model (the ηs: Self-Esteem, External Locus of Control, Emotional Exhaustion, Depersonalization, Personal Accomplishment), followed by those measuring the independent factors (the ξs). Recall, also, that the reordering of variables is made possible using the SELECTION command. Of particular note in this regard is the slash (/) at the end of the reordered variables. If this slash is not present, the omission triggers the following error message:
SELECTION
17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
MO NY=16 NX=16 NE=5 NK=7 LX=FU,FI LY=FU,FI BE=IN GA=IN PH=IN PS=SY,FI TE=DI,FR TD=DI,FR
W A R N I N G: Variable "MO" is not defined.

Given that the final best-fitting model under scrutiny in this chapter, for high school teachers, derived from the originally hypothesized model (for all three panels of teachers) specified in chapter 7, a detailed explanation of model input is not presented here. However, should you seek further clarification of model specification, as presented on the MO line, it may be helpful to review chapter 7 regarding this aspect of the input file. We turn now to the results of this analysis.
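The effect of the SELECTION line, reading the 16 dependent-variable measures (variables 17-32) ahead of the 16 independent-variable measures, amounts to a simple reordering of the variable list. The Python sketch below illustrates this using the variable names from the LA line of Table 11.1:

```python
# Reordering the observed variables so that the y-measures (indicators of
# the etas) precede the x-measures, as LISREL's NY/NX specification requires.

labels = ["ROLEA1", "ROLEA2", "ROLEC1", "ROLEC2", "WORK1", "WORK2",
          "CLIMATE1", "CLIMATE2", "CLIMATE3", "CLIMATE4", "DEC1", "DEC2",
          "SSUP1", "SSUP2", "PSUP1", "PSUP2", "SELF1", "SELF2", "SELF3",
          "XLOC1", "XLOC2", "XLOC3", "XLOC4", "XLOC5", "EE1", "EE2", "EE3",
          "DP1", "DP2", "PA1", "PA2", "PA3"]

# The SELECTION list from Table 11.1 (1-based positions, as in LISREL):
selection = list(range(17, 33)) + list(range(1, 17))

reordered = [labels[i - 1] for i in selection]
print(reordered[0], reordered[16])  # SELF1 ROLEA1
```

The first 16 entries of the reordered list are the y-variables (SELF1 through PA3) and the last 16 are the x-variables (ROLEA1 through PSUP2), mirroring what the SELECTION command does to the data.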
LISREL Output File

Given the complexity of the full model specification (i.e., measurement and structural models), it is worthwhile to review the parameter specification section of the output before moving on to the goodness-of-fit statistics. Of particular import are the structural parameter specifications, and these are shown in Table 11.2. I consider it vital for you to see clearly that the parameter specifications shown in Table 11.2 accurately reflect the structure portrayed in Fig. 11.1. Although you could, of course, also check compatibility by working your way through the input file, a review of the parameter specification summary from the output file is much less tedious. Turning to this summary, therefore, we see that there are 13 structural paths: eight γs and five βs. In interpreting these matrices, we read from the variable listed at the top of the matrix to the variable listed to the left of the matrix. For example, in the first column of the Beta matrix, parameter 25 represents a path flowing from Self-Esteem to Locus of Control; in the Gamma matrix, parameter 31 represents a path from Peer Support to Self-Esteem. Let us turn now to the goodness-of-fit results. These indices reflect the simultaneous estimation of the final model of burnout for both the Calibration and Validation samples and are presented in Table 11.3. As noted in chapter 8, the χ² value represents the sum of the χ² test statistics relative to each sample, together with the sum of the degrees of freedom; as
Table 11.2
Parameter Specifications Related to Burnout Structure for High School Teachers

BETA
          SELF  XLOCUS    EE    DP    PA
SELF         0       0     0     0     0
XLOCUS      25       0     0     0     0
EE           0       0     0     0     0
DP           0       0    26     0     0
PA          27      28     0    29     0

GAMMA
          ROLEA  ROLEC  WORK  CLIMATE  DECISION  SSUPPORT  PSUPPORT
SELF          0      0     0        0        30         0        31
XLOCUS        0     32     0        0        33         0         0
EE            0      0    34       35         0         0         0
DP            0     36     0       37         0         0         0
PA            0      0     0        0         0         0         0
Table 11.3
LISREL Output: Goodness-of-Fit Statistics for Estimated Model Without Equality Constraints

GOODNESS OF FIT STATISTICS
CHI-SQUARE WITH 882 DEGREES OF FREEDOM = 2946.80 (P = 0.0)
ESTIMATED NON-CENTRALITY PARAMETER (NCP) = 2064.80
ROOT MEAN SQUARE ERROR OF APPROXIMATION (RMSEA) = 0.057
EXPECTED CROSS-VALIDATION INDEX (ECVI) = 2.31
STANDARDIZED RMR = 0.15
GOODNESS OF FIT INDEX (GFI) = 0.89
COMPARATIVE FIT INDEX (CFI) = 0.91
CONFIDENCE LIMITS COULD NOT BE COMPUTED DUE TO TOO SMALL P-VALUE FOR CHI-SQUARE
can be seen in Table 11.3, this value is 2,946.80 with 882 degrees of freedom. Because it is almost impossible to judge the adequacy of multigroup model fit based on this value, we turn to the GFI and CFI values, which indicate a marginally adequate fit to the data representing both groups of high school teachers. Not surprisingly, the same misspecified parameter noted earlier (Emotional Exhaustion → Self-Esteem) was also evident for the Validation sample. It is clear that had this parameter been freely estimated for both
groups, the overall model fit would have been improved substantially. However, in addition to its aforementioned implausibility, the incorporation of this parameter would have been inappropriate under the present test of cross-validation. Thus, we use the present model fit as a yardstick against which to determine the tenability of imposed equality constraints; we turn next to the specification of these constraints.
LISREL Input

Although the major focus of cross-validation in the original study (Byrne, 1994a) was directed toward the "causes" of teacher burnout, and constraints were therefore placed only on the structural paths, we apply a more stringent test of invariance here. As such, we impose equality constraints on both the factor loadings (including cross-loadings) and the structural paths across calibration and validation groups. The relevant section of the related input file is shown in Table 11.4. Two important factors are worthy of note with respect to this input file. First, because the entire measurement model is hypothesized to be equivalent across the two groups, both the LY and LX factor matrices are specified as invariant (IN) on the MO line. Relatedly, the specification of these parameters as free, and the specification of their start values, are deleted from the input file. In my experience, I have found that if these deletions are not made, the program will ignore the LY=IN specification. Second, given that all structural paths shown in Fig. 11.1 are being tested for their invariance across the two groups, it would seem logical that both the Beta and Gamma matrices could be similarly specified as invariant on the MO line (i.e., BE=IN; GA=IN); again, however, it has been my experience that such a specification results in no constraints being imposed at all. Rather, equality constraints pertinent to particular elements of these matrices must be specified individually, as shown in Table 11.4.
LISREL Output

Results from the estimation of this highly restrictive multigroup model yielded a χ² value of 2,995.89 with 919 degrees of freedom; not unexpectedly, the GFI and CFI values remained unchanged at .89 and .91, respectively. To assess the tenability of these equality constraints, we compare this model with the previous one in which no constraints were imposed. Accordingly, this comparison yielded a Δχ²(37) = 49.09, which is statistically nonsignificant.
Table 11.4
LISREL Input: Testing for the Invariance of Factor Loadings and Structural Paths

MO NY=16 NX=16 NE=5 NK=7 LX=IN LY=IN BE=FU,FI GA=FU,FI PH=SY,FI PS=SY,FI TE=DI,FR TD=DI,FR
LE
SELF XLOCUS EE DP PA
LK
ROLEA ROLEC WORK CLIMATE DECISION SSUPPORT PSUPPORT
VA 1.0 LY(1,1) LY(4,2) LY(9,3) LY(12,4) LY(14,5)
ST .8 LY(2,1) LY(3,1) LY(5,2) LY(6,2) LY(7,2) LY(8,2) LY(15,5)
EQ BE(1,2,1) BE(2,1)
EQ BE(1,5,1) BE(5,1)
EQ BE(1,5,2) BE(5,2)
EQ BE(1,4,3) BE(4,3)
EQ BE(1,5,4) BE(5,4)
EQ GA(1,1,5) GA(1,5)
EQ GA(1,1,7) GA(1,7)
EQ GA(1,2,2) GA(2,2)
EQ GA(1,2,5) GA(2,5)
EQ GA(1,3,3) GA(3,3)
EQ GA(1,3,4) GA(3,4)
EQ GA(1,4,2) GA(4,2)
EQ GA(1,4,4) GA(4,4)
OU NS
Thus, we conclude that the model, as presented in Fig. 11.1, can be considered invariant across the calibration and validation samples, thereby supporting its replication for high school teachers. In testing for the invariance of a theoretical construct across gender in chapter 8, we determined that the postulated invariance of the entire Phi matrix was not tenable. To pinpoint the noninvariant elements in the matrix, we subsequently tested for the equality of each across the two groups. However, another approach that can be taken to identify the noninvariant parameters is to review the modification indices, along with the related expected parameter change statistics specific to only those parameters for which equality constraints were imposed. Thus, as a matter of interest only, let us examine this portion of the output file as it relates only to the structural parameters; this information is presented in Table 11.5. In reviewing the modification indices in Table 11.5, you will note some very large values (e.g., 94.53, 64.52, and so on). However, these indices relate to parameters that were fixed to zero in the originally hypothesized model. Although they represent some degree of misspecification in the model, their presence can likely be explained by the differential approach to the estimation of modification indices performed by LISREL versus EQS, as noted
Table 11.5
LISREL Output: Modification Indices Related to Structural Parameters for Validation Sample

MODIFICATION INDICES FOR BETA
          SELF  XLOCUS     EE     DP     PA
SELF      1.31    7.89  94.53  64.52   3.68
XLOCUS    0.09    1.69   2.05   1.37   0.00
EE       56.40    5.03   0.02   1.59  20.08
DP        8.02    0.12   5.51   8.02   2.57
PA        0.18    0.54   8.31   2.55   2.18

EXPECTED CHANGE FOR BETA
          SELF  XLOCUS     EE     DP     PA
SELF     -0.19   -0.14  -0.12  -0.13   0.05
XLOCUS   -0.01    0.17   0.02   0.02   0.00
EE       -0.85    0.24   0.01  -0.08  -0.25
DP       -0.29   -0.03  -0.05  -0.24   0.09
PA        0.03   -0.04    ...  -0.04   0.17

MODIFICATION INDICES AND EXPECTED CHANGE FOR GAMMA
[The Gamma values are largely illegible in the source. Recoverable modification-index columns: ROLEA = 30.12, 1.50, 0.11, 0.02, 1.78; ROLEC = 43.02, 2.03, 0.02, (illegible), 0.03; PSUPPORT = 1.55, 1.86, 1.21, 2.20, 13.39. As discussed in the text, the underlined (constrained) entries include a modification index of 6.68 and an expected change of .20 for the Climate → DP path, GA(4,4).]
earlier in chapter 7. One exception to these differential modification indices, however, is the one representing the impact of Emotional Exhaustion (EE) on Self-Esteem (Self), which was similarly found in the analyses from both programs. However, within the context of the original study, the free estimation of this path was not substantively meaningful. For our purposes here, however, we focus solely on statistics related to the constrained parameters; these have been underlined in Table 11.5. Had our test for invariance been found to be untenable, we could then have examined these modification indices with a view to identifying which constraint, if released, would lead to a tenable respecification of the remaining constraints. In reviewing the underlined values in both the Beta and Gamma matrices, we see that the largest underlined modification index value is 6.68, representing the path from Classroom Climate to Depersonalization (DP). Although the value of 5.51 in the Beta matrix is large, note that its Expected Parameter Change statistic is substantially less than the one associated with the Climate → DP path in the Gamma matrix (-.05 vs. .20). In fact, the Expected Parameter Change statistic associated with this Gamma parameter is distinctly larger than all others associated with constrained parameters. On the basis of these findings, then, had it been necessary to release one or more equality constraints, I would have initially released the constraint on this parameter (GA[4,4]). In sum, the test of invariance illustrated in this application represented an excessively rigid test of cross-validation. Indeed, Bollen (1989b) has indicated that if a researcher's multigroup focus is directed more toward the equality of structural, rather than measurement, parameters, then testing for the invariance of the former may precede the latter; this was the procedure followed in the original study (Byrne, 1994a).
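The decision rule described here, judging the constrained parameters jointly by their modification indices and expected parameter change (EPC) statistics, can be sketched as follows. The two entries are the values discussed in the text; the ranking criterion shown is one plausible formalization of the logic, not a prescription from the book:

```python
# Choosing which equality constraint to release first: among the constrained
# parameters only, favor a large modification index that is also backed by a
# sizable expected parameter change.

constrained = {
    "BE(4,3)  EE -> DP":      {"mi": 5.51, "epc": -0.05},
    "GA(4,4)  Climate -> DP": {"mi": 6.68, "epc": 0.20},
}

# Rank by modification index, breaking near-ties by the magnitude of the EPC.
candidate = max(constrained.items(),
                key=lambda kv: (kv[1]["mi"], abs(kv[1]["epc"])))
print(candidate[0])  # GA(4,4)  Climate -> DP
```

Here GA(4,4) wins on both criteria, which matches the conclusion drawn in the text.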
FROM THE SIMPLIS PERSPECTIVE

The SIMPLIS input file presented in this chapter is designed to test for the invariance of factor loadings and structural regression paths across calibration and validation samples of high school teachers; it is comparable to the
¹Modification of the originally hypothesized model was based on the LM Test statistics used in the EQS program.
LISREL input file shown in Table 11.4. However, because only the constraint segment of the LISREL file is evident in Table 11.4, you will need also to review the multigroup model setup in Table 11.1 in order to follow the full translation from LISREL to SIMPLIS language. The SIMPLIS input file is shown in Table 11.6. Although structurally the specifications for the first group resemble the SIMPLIS files presented in the two previous chapters, I believe that it is worthwhile to review several aspects of this particular file. As noted previously, SIMPLIS assumes that the model specified for Group 1 holds in all subsequent groups unless otherwise indicated; as such, only differences are specified for groups other than Group 1. Moreover, the mere inclusion of the word "Group" signals to the program that specifications for another set of sample data are provided. Reviewing the SIMPLIS file in Table 11.6, it is worthwhile to highlight a few of its features as specified for Group 1, the Calibration sample. First, we see that the data are in raw data matrix form and reside in an external file labeled "SECIND1L.DAT." Note here that, in contrast to LISREL, wherein one can add the fixed format statement to the input file, SIMPLIS requires that the format statement constitute the first line of the data set. Second, as with the SIMPLIS input file presented in chapter 8, the one shown here in Table 11.6 includes only a subset of variables from the data set. To avoid the same difficulties described in chapter 8, the REORDER command has been included. Third, appearing first under the EQUATIONS keyword are the measurement model specifications, followed by the structural relations. Finally, because the hypothesized model specified only 6 factor correlations among the independent latent factors a priori, the remaining 15 correlations in the factor variance-covariance matrix must be specified as being equal to zero.
This requirement, of course, derives from the fact that SIMPLIS presumes all independent factors to be intercorrelated. Turning to the specifications for Group 2, the Validation sample, we see that these raw data reside in an external file labeled "SECIND2L.DAT." Recall that the only constraints that we wish to impose on Group 2 in this example apply to the factor loadings and the structural parameters. Thus, it is necessary to relax all other intergroup equivalencies presumed by the program. Accordingly, we specify the following parameters to be freely estimated: (a) the variances of the independent factors (ROLEA-PSUPPORT); (b) the specified factor covariances (there are six, as noted earlier);
Table 11.6
SIMPLIS Input: Testing for the Invariance of Factor Loadings and Structural Paths

! Testing for Invariance of Causal Structure of Burnout "BURNSCV2.SUI"
! Model 2: Factor Loadings and Structural Paths Invariant
Group 1: Calibration Sample
OBSERVED VARIABLES:
ROLEA1 ROLEA2 ROLEC1 ROLEC2 WORK1 WORK2 CLIMATE1 CLIMATE2 CLIMATE3 CLIMATE4
DEC1 DEC2 SSUP1 SSUP2 PSUP1 PSUP2 SELF1 SELF2 SELF3 XLOC1 XLOC2 XLOC3 XLOC4
XLOC5 EE1 EE2 EE3 DP1 DP2 PA1 PA2 PA3
RAW DATA from FILE SECIND1L.DAT
SAMPLE SIZE: 715
REORDER VARIABLES:
SELF1 SELF2 SELF3 XLOC1 XLOC2 XLOC3 XLOC4 XLOC5 EE1 EE2 EE3 DP1 DP2 PA1 PA2
PA3 ROLEA1 ROLEA2 ROLEC1 ROLEC2 WORK1 WORK2 CLIMATE1 CLIMATE2 CLIMATE3
CLIMATE4 DEC1 DEC2 SSUP1 SSUP2 PSUP1 PSUP2
LATENT VARIABLES:
SELF XLOCUS EE DP PA ROLEA ROLEC WORK CLIMATE DECISION SSUPPORT PSUPPORT
EQUATIONS:
SELF1 = 1*SELF
SELF2 = (.8)*SELF
SELF3 = (.8)*SELF
XLOC1 = 1*XLOCUS
XLOC2 = (.8)*XLOCUS
XLOC3 = (.8)*XLOCUS
XLOC4 = (.8)*XLOCUS
XLOC5 = (.8)*XLOCUS
EE1 = 1*EE
EE2 = (1)*EE
EE3 = (.9)*EE
DP1 = 1*DP
DP2 = (.9)*DP
PA1 = 1*PA
PA2 = (.8)*PA
PA3 = (.9)*PA
ROLEA1 = 1*ROLEA
ROLEA2 = (1)*ROLEA
ROLEC1 = 1*ROLEC
ROLEC2 = (1.7)*ROLEC
WORK1 = 1*WORK
WORK2 = (.9)*WORK
CLIMATE1 = 1*CLIMATE
CLIMATE2 = (.6)*CLIMATE
CLIMATE3 = (.6)*CLIMATE
CLIMATE4 = (.6)*CLIMATE
DEC1 = 1*DECISION
DEC2 = (1)*DECISION
SSUP1 = 1*SSUPPORT
SSUP2 = (1)*SSUPPORT
PSUP1 = 1*PSUPPORT
PSUP2 = (1)*PSUPPORT
DEC1 = (-.6)*ROLEA
ROLEC1 = (-.4)*DECISION
ROLEC2 = (-.07)*SSUPPORT
ROLEA1 = (-.16)*PSUPPORT
SELF = PSUPPORT DECISION
XLOCUS = DECISION SELF ROLEC
EE = WORK CLIMATE
DP = EE CLIMATE ROLEC
PA = DP SELF XLOCUS
SET the covariance of ROLEA and CLIMATE to 0.0
SET the covariance of ROLEA and DECISION to 0.0
SET the covariance of ROLEA and SSUPPORT to 0.0
SET the covariance of ROLEA and PSUPPORT to 0.0
SET the covariance of ROLEC and CLIMATE to 0.0
SET the covariance of ROLEC and DECISION to 0.0
SET the covariance of ROLEC and SSUPPORT to 0.0
SET the covariance of ROLEC and PSUPPORT to 0.0
Table 11.6 Cont'd

SET the covariance of WORK and CLIMATE to 0.0
SET the covariance of WORK and DECISION to 0.0
SET the covariance of WORK and SSUPPORT to 0.0
SET the covariance of WORK and PSUPPORT to 0.0
SET the covariance of CLIMATE and DECISION to 0.0
SET the covariance of CLIMATE and SSUPPORT to 0.0
SET the covariance of CLIMATE and PSUPPORT to 0.0
Group 2: Validation Sample
RAW DATA from FILE SECOND2L.DAT
SAMPLE SIZE: 714
LET the variances of ROLEA - PSUPPORT be free
LET the covariance of ROLEA and ROLEC be free
LET the covariance of ROLEA and WORK be free
LET the covariance of ROLEC and WORK be free
LET the covariance of DECISION and SSUPPORT be free
LET the covariance of DECISION and PSUPPORT be free
LET the covariance of SSUPPORT and PSUPPORT be free
LET the variance of SELF be free
LET the variance of XLOCUS be free
LET the variance of EE be free
LET the variance of DP be free
LET the variance of PA be free
LET the error variances of SELF1 - PA3 be free
LET the error variances of ROLEA1 - PSUP2 be free
LISREL OUTPUT
END OF PROBLEM
(c) the disturbance terms (i.e., error variances) associated with the five dependent latent factors (SELF-PA); and (d) the error variances associated with the observed variables. Finally, consistent with the previously specified SIMPLIS input files, we have requested that the output be structured in the LISREL format.
APPLICATIONS RELATED TO OTHER DISCIPLINES

Business: Durvasula, S., Andrews, J. C., Lysonski, S., & Netemeyer, R. G. (1993). Assessing the cross-national applicability of consumer behavior models: A model of attitudes toward advertising in general. Journal of Consumer Research, 19, 626-636.

Education: Eimers, M. T., & Pike, G. R. (1997). Minority and nonminority adjustment to college: Differences or similarities? Research in Higher Education, 38, 77-97.

Sociology: Grosset, J. M. (1997). Differences in educational outcomes of African American students from different socioeconomic backgrounds. Structural Equation Modeling: A Multidisciplinary Journal, 4, 52-64.
PART FOUR
LISREL, PRELIS, and SIMPLIS Through Windows
CHAPTER 12: APPLICATION 10
Testing for Causal Predominance Using a Two-Wave Panel Model
CHAPTER 12
Application 10: Testing for Causal Predominance Using a Two-Wave Panel Model
Keeping in step with the mass migration from old world mainframe and DOS environment computing to new world computing within the Microsoft Windows environment, Joreskog and Sorbom (1993c) have adapted both the LISREL 8 and PRELIS 2 programs for use within this framework. One of the key features of this adaptation is the capability of the LISREL program to produce multiple aspects of any path diagram, accompanied by the appropriate estimates, modification indices (MIs), or expected parameter change values. The purpose of this final chapter, therefore, is to present you with at least a flavor of what to expect when you use the LISREL and PRELIS programs within the Windows environment. As a vehicle for demonstrating the Windows versions of LISREL 8 and PRELIS 2, our last application involves a two-wave panel study of high school adolescents that was designed to answer the question of whether self-concept "causes" academic achievement, or whether the reverse direction of flow is true; the example is taken from Byrne (1986). In addition to being able to show you the many faces of a Windows application, this particular example affords me the opportunity to illustrate the new Theta-Delta-Epsilon matrix, a feature that is unique to the LISREL 8 program. Specifically, this matrix allows for correlated errors across time; previously, estimation of these parameters was possible only through a reparameterization of the model into either an all-X or an all-Y model.
USING THE WINDOWS VERSIONS OF LISREL 8 AND PRELIS 2

As with other Windows applications, all that is needed to use both LISREL 8 and PRELIS 2 is to simply double-click on the LISREL 8 icon, which appears in Fig. 12.1.
LISREL for Windows v8.14
FIG. 12.1. LISREL 8 Icon for Windows Version of the Program.
Once you click on the icon, you will be presented with a menu bar that appears at the top of the screen, as shown in Fig. 12.2. By means of the menu bar, you are able to perform a number of different functions, such as retrieval of files; editing of files; customizing the menu bar; running analyses; displaying, specifying, and estimating path diagrams; and searching for help with respect to using the LISREL and PRELIS programs within the Windows environment. I now present a brief overview of each of these components of the main menu bar.
FIG. 12.2. Main Menu Bar for LISREL 8/Windows.
File

To activate the file menu, you simply click on File in the menu bar.1 You will then be presented with a series of choices with respect to either the creation of a new file (New), retrieval of an existing file (Open, Open Last

1Although my preference would have been to capture the various menu selections for presentation in a figure, the program did not permit such replication.
20 on the OU line. The first command tells the program to set off the admissibility check, which also allows the program to exceed the default number of iterations; the second command allows for more than the default number of iterations, which is 20. The purpose of the admissibility check is to prevent the unnecessary running of the program through many iterations, with no useful results. More specifically, the admissibility check determines whether (a) the factor loading matrices have full column ranks, and no rows containing only zeros, and (b) all variance-covariance matrices are positive definite (Joreskog & Sorbom, 1993a). If a model fails the admissibility check, this may be indicative of a poorly specified model. For example, it may be that the model estimated differs substantially from the one intended, or that the model strongly disagrees with the data. If, on the other hand, the model is the one intended, yet the admissibility check still fails after 20 iterations, it is possible to set the admissibility check to a criterion that allows for a higher number of iterations (e.g., AD=40). These explanations aside, there can be instances where the researcher knows that the solution is inadmissible, and yet still wishes to estimate the model as specified. For example, given that one of the variances in a variance-covariance matrix is specified as a fixed zero by intent, this specification will automatically trigger the nonadmissibility error message (see, e.g., chapter 6). In the present application, this error message derives from the fact that both the Phi and Psi matrices are nonpositive definite. In the case of the Phi matrix, these findings likely result from one variance being of lesser value than the covariances (ASC1 = 1.10); in the case of the Psi matrix, it results from the presence of a negative variance (AA2 = -3.75).
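The two admissibility conditions can be checked directly from a solution's matrices. The sketch below is a minimal illustration in Python (the function names and example matrices are my own, not LISREL output); it flags the kind of nonpositive definite Psi matrix described above.

```python
import numpy as np

def is_positive_definite(m, tol=1e-10):
    # A symmetric matrix is positive definite iff all eigenvalues exceed zero
    return bool(np.all(np.linalg.eigvalsh(m) > tol))

def admissibility_flags(lam, phi, psi):
    # Mirror the two checks described above: (a) the loading matrix has full
    # column rank and no all-zero rows; (b) covariance matrices are PD
    full_rank = int(np.linalg.matrix_rank(lam)) == lam.shape[1]
    no_zero_rows = not bool(np.any(np.all(lam == 0, axis=1)))
    return {
        "loadings_ok": full_rank and no_zero_rows,
        "phi_ok": is_positive_definite(phi),
        "psi_ok": is_positive_definite(psi),
    }

# Illustrative matrices (not LISREL output): a Psi diagonal containing a
# negative variance, as with AA2 = -3.75 here, fails the check.
lam = np.array([[1.0, 0.0], [0.8, 0.0], [0.0, 1.0]])
phi = np.array([[1.0, 0.3], [0.3, 1.0]])
bad_psi = np.diag([4.94, 0.07, -3.75])
good_psi = np.diag([4.94, 0.07, 3.75])
print(admissibility_flags(lam, phi, bad_psi))   # psi_ok is False
print(admissibility_flags(lam, phi, good_psi))  # all checks pass
```

Using eigenvalues rather than an attempted Cholesky factorization makes the tolerance for near-singular matrices explicit.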
For pedagogical reasons, however, I wish to retain the current specification for purposes of demonstrating how and to what extent the model is misspecified. Thus, the model is once again estimated, but this time with the keyword AD=OFF added to the OU line.
This reestimation of the model resulted in an admissible solution after only 26 iterations. Nonetheless, note in Table 12.3 that estimates in both the Phi and Psi matrices are identical to those reported in the intermediate solution shown in Table 12.2. Clicking on the Path Diagram button yielded the schematic representation of this initial model shown in Fig. 12.9; this type of path diagram, wherein each measurement and structural path is accompanied by either the fixed or estimated value, is referred to in the Model menu as the basic model. I should, perhaps, be a little more specific about the Path diagram button. Upon clicking this button, you are presented with the menu bar shown in Fig. 12.5. To obtain a hard copy of the figure displayed in the path diagram, you then click on the Print button, after which you are presented with two options: Font and Print. The first option (Font) allows you to alter either the text font or the lexicon displayed in the path diagram; you can also alter the size of the font. Although there are 100 selections from which to choose, only 3 such examples are displayed in Fig. 12.10. The second option, of course, allows you to print out any path diagram that is actively displayed on screen.
Table 12.3
LISREL Output: Estimates for PHI and PSI Matrices

Number of Iterations = 26

PHI
          GSC1     ASC1     AA1
GSC1     10.90
         (0.89)
         12.22
ASC1      2.66     1.10
         (0.22)   (0.12)
         11.87     9.20
AA1       3.35     4.28    23.68
         (0.70)   (0.37)   (2.49)
          4.77    11.52     9.51

PSI
          GSC2     ASC2     AA2
          4.94     0.07    -3.75
         (0.53)   (0.04)   (1.19)
          9.33     1.97    -3.16
FIG. 12.9. Basic Path Diagram for Hypothesized Model Showing Maximum Likelihood Estimates.
In my earlier description of the Path diagram menu bar (Fig. 12.5), I noted that by clicking on the Model and Kind buttons, you are able to select (a) a particular component of the model tested, and (b) particular statistical information recorded in the output. For example, suppose that we are interested in results bearing on the structural aspect of our hypothesized model. By clicking first on Model, and then on Structural model from the drop-down menu, we are presented with a path diagram that includes the structural paths related to the Gamma matrix, factor covariances related to the independent factors (the ξs), and the disturbance terms related to the dependent factors (the ηs); this model is shown in Fig. 12.11, along with the maximum likelihood estimates. I wish to draw your attention, in particular, to the disturbance term estimates. As presented in the path diagram, it appears that each estimate related to these parameters is a negative value. However, if you compare these values with those presented in Table 12.3, you will readily note that only the variance estimate related to AA2 is negative (-3.75). This ambiguity is clearly a function of programming related to the assignment of numbers that must be placed to the right of any symbols in a model; as such, it is likely that this problem will be resolved in subsequent versions of the program.
FIG. 12.10. Example of Fonts from the PRINT Drop-down Menu.
Let us turn now to the goodness-of-fit indices that are obtained by clicking on the Kind button and selecting Fit statistics from the drop-down menu; they are shown in Table 12.4. In reviewing this information, and in particular, the GFI, AGFI, and CFI values of .79, .64, and .76, respectively, it is very evident that the model is most certainly misspecified.
FIG. 12.11. Structural Path Diagram for Hypothesized Model Showing Maximum Likelihood Estimates.
For clues to possible areas of model misfit, let us now turn to the modification indices related to the Gamma, Psi, and Theta matrices, which are presented in Table 12.5. In reviewing, first, the Gamma matrix, we can see two very large modification indices that are worthy of serious consideration. These values represent structural paths, the first representing the impact of GSC1 on GSC2, and the second the impact of AA1 on AA2. Indeed, the estimation of these parameters makes good intuitive sense; moreover, the GSC path is consistent with self-concept theory in substantiating the strong stability of general self-concept over time. Thus, the estimation of these parameters represents one possible model modification. Turning now to the Psi matrix, again we see a very large modification index representing the covariance between the disturbance terms for GSC2 and AA2. This parameter cannot be explained substantively; rather, it is indicative of other misspecification in the model. Finally, let us examine a new matrix produced in the LISREL 8 version of the program termed the Theta-Delta-Epsilon matrix. As the label implies, this matrix allows for the correlation of measurement error over time. Given the known possibility of memory carryover effects associated
with measuring instruments, such correlated error parameters would appear to be perfectly reasonable. Indeed, a review of the diagonal elements in this matrix, which represent correlated errors across time for each measure in the model, shows the values of these parameters to be substantial; furthermore, the related EPC estimates are equally substantial. For the sake of completeness, let us now view the path diagram presented in Fig. 12.12, which is produced by clicking first on the Model button, with selection of the Structural model, and second, clicking on the Kind button, with selection of Modification indices. This diagram partially reflects the information reported in Table 12.5; that is, it reports substantial modification indices related only to the structural paths across time and the disturbance terms associated with the Time 2 factors. (Note, however, that only a
Table 12.4
LISREL Output: Goodness-of-Fit Statistics for Hypothesized Model

GOODNESS OF FIT STATISTICS
CHI-SQUARE WITH 45 DEGREES OF FREEDOM = 1702.43 (P = 0.0)
ESTIMATED NON-CENTRALITY PARAMETER (NCP) = 1657.43
MINIMUM FIT FUNCTION VALUE = 1.83
POPULATION DISCREPANCY FUNCTION VALUE (F0) = 1.79
ROOT MEAN SQUARE ERROR OF APPROXIMATION (RMSEA) = 0.20
EXPECTED CROSS-VALIDATION INDEX (ECVI) = 1.91
ECVI FOR SATURATED MODEL = 0.17
ECVI FOR INDEPENDENCE MODEL = 7.68
CHI-SQUARE FOR INDEPENDENCE MODEL WITH 66 DEGREES OF FREEDOM = 7099.45
INDEPENDENCE AIC = 7123.45
MODEL AIC = 1768.43
SATURATED AIC = 156.00
INDEPENDENCE CAIC = 7193.46
MODEL CAIC = 1960.95
SATURATED CAIC = 611.06
ROOT MEAN SQUARE RESIDUAL (RMR) = 5.45
STANDARDIZED RMR = 0.091
GOODNESS OF FIT INDEX (GFI) = 0.79
ADJUSTED GOODNESS OF FIT INDEX (AGFI) = 0.64
PARSIMONY GOODNESS OF FIT INDEX (PGFI) = 0.46
NORMED FIT INDEX (NFI) = 0.76
NON-NORMED FIT INDEX (NNFI) = 0.65
PARSIMONY NORMED FIT INDEX (PNFI) = 0.52
COMPARATIVE FIT INDEX (CFI) = 0.76
INCREMENTAL FIT INDEX (IFI) = 0.77
RELATIVE FIT INDEX (RFI) = 0.65
CRITICAL N (CN) = 39.13
CONFIDENCE LIMITS COULD NOT BE COMPUTED DUE TO TOO SMALL P-VALUE FOR CHI-SQUARE
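Several of the indices in Table 12.4 can be recomputed from the two chi-square values alone. The short Python sketch below (a hedged illustration; the variable names are mine) reproduces the NCP, NFI, and CFI reported above from the model and independence chi-squares.

```python
# Chi-square values from Table 12.4: hypothesized model and the
# independence (null) model estimated on the same data.
chi2_model, df_model = 1702.43, 45
chi2_null, df_null = 7099.45, 66

ncp = max(chi2_model - df_model, 0.0)               # estimated non-centrality
nfi = 1.0 - chi2_model / chi2_null                  # normed fit index
cfi = 1.0 - ncp / max(chi2_null - df_null, 1e-12)   # comparative fit index

print(round(ncp, 2))   # 1657.43
print(round(nfi, 2))   # 0.76
print(round(cfi, 2))   # 0.76
```

That the recomputed values match the printed NCP, NFI, and CFI confirms these indices are simple functions of the two chi-squares and their degrees of freedom.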
Table 12.5
LISREL Output: Modification Indices for Hypothesized Model

MODIFICATION INDICES FOR GAMMA
          GSC1     ASC1     AA1
GSC2    109.82
ASC2               3.09
AA2                       61.52

EXPECTED CHANGE FOR GAMMA
          GSC1     ASC1     AA1
GSC2      1.51
ASC2              -0.23
AA2                        2.89

NO NON-ZERO MODIFICATION INDICES FOR PHI

MODIFICATION INDICES FOR PSI
          GSC2     ASC2     AA2
GSC2      - -
ASC2      0.78     - -
AA2     186.71    36.13    - -

EXPECTED CHANGE FOR PSI
          GSC2     ASC2     AA2
GSC2      - -
ASC2     -0.07     - -
AA2       6.50    -0.64    - -

MODIFICATION INDICES FOR THETA-EPS
         GSCC2    GSCR2    ASCC2    ASCB2    AAET2    AAMA2
 17.20   11.58     2.08    14.71
  1.08    0.00    13.13    33.27
 15.97    1.62
  1.83   28.11

EXPECTED CHANGE FOR THETA-EPS
         GSCC2    GSCR2    ASCC2    ASCB2    AAET2    AAMA2
  0.68   -1.27     1.35     2.50
  0.20   -0.02     4.01     4.46
 -1.80    0.39
 -1.42   -4.31
subset of these model components and their related parameter values are incorporated into the diagram.) This selectivity occurs because LISREL 8 displays only modification indices larger than 7.88,3 which represents the

3Nonetheless, all nonzero modification indices may be displayed by pressing "A" (for all modification indices).
Table 12.5 cont.

MODIFICATION INDICES FOR THETA-DELTA-EPS
         GSCC1    GSCR1    ASCC1    ASCB1    AAET1    AAMA1
GSCC2   117.29    20.15     1.47    60.54     7.30     0.49
GSCR2    45.59    85.77     0.12     9.24     6.55     0.60
ASCC2     4.44     2.18    88.25     5.12    12.76     4.50
ASCB2    19.50     8.81    38.21   326.27     0.76    56.85
AAET2     0.36     0.04    12.96     0.20   564.74    20.34
AAMA2     4.10    29.56     0.05     2.52   125.05   305.94

EXPECTED CHANGE FOR THETA-DELTA-EPS
         GSCC1    GSCR1    ASCC1    ASCB1    AAET1    AAMA1
GSCC2     4.26    -2.09     0.21    -3.02     2.60     0.47
GSCR2    -2.80     4.55    -0.06    -1.24    -2.58    -0.55
ASCC2     0.34    -0.28     0.74    -0.39    -1.56    -0.60
ASCB2    -1.58    -1.57    -1.09     7.38     0.86    -5.83
AAET2     0.44    -0.17    -1.27     0.34    49.81    -6.75
AAMA2    -1.18    -3.67    -0.05    -1.10   -28.28    48.53

MODIFICATION INDICES FOR THETA-DELTA
         GSCC1    GSCR1    ASCC1    ASCB1    AAET1    AAMA1
GSCC1     - -
GSCR1     4.89     - -
ASCC1    16.00     0.42     - -
ASCB1    36.82     1.18     1.02     - -
AAET1     0.80     0.09     4.45    12.83     - -
AAMA1    15.90    10.85     1.61     0.40    42.30     - -

EXPECTED CHANGE FOR THETA-DELTA
         GSCC1    GSCR1    ASCC1    ASCB1    AAET1    AAMA1
GSCC1     - -
GSCR1     1.22     - -
ASCC1     0.70    -0.12     - -
ASCB1    -2.53    -0.48    -0.17     - -
AAET1     0.68    -0.24    -0.74     2.77     - -
AAMA1     2.51     2.20     0.34    -0.44    -8.36     - -

MAXIMUM MODIFICATION INDEX IS 564.74 FOR ELEMENT (5, 5) OF THETA DELTA-EPSILON
99.5 percentile of the chi-square distribution with one degree of freedom (Joreskog & Sorbom, 1993a).4 Unfortunately, at the present time there appears to be no matching diagram for the Theta-Delta-Epsilon matrix as shown in Table 12.5. Although one can obtain the correlated errors model from the Model menu, the modification indices and model components relate only to the

4It is, however, possible to change this default value.
FIG. 12.12. Structural Path Diagram for Hypothesized Model Showing Modification Indices.
observed variables. Because interpretation of this path diagram is made extremely difficult as a consequence of overlapping print related to the MI estimates, it is not included here. Given the substantive meaningfulness of correlated errors across time, together with statistical evidence that the estimation of these parameters would lead to a statistically better fitting model, the initially hypothesized model was respecified such that, for each observed measure, the error term at Time 1 was correlated with that at Time 2. It is important to note, however, that whereas this specification would have required reparameterization of the model into either an all-X or an all-Y model using earlier versions of the program, LISREL 8 now has a special matrix designed to accommodate this type of specification; it is termed the Theta matrix and is symbolized as Θδε. The order of this matrix is q x p, representing the covariance matrix between δ and ε. It is a fixed-zero matrix by default, and cannot be declared as free on the MO line. However, any TH element may be declared free on a FR command. The input file for this respecified model is shown in Table 12.6.
Table 12.6
LISREL Input: Final Model with Correlated Errors Across Time

M PH(2,2) PH(3,3)
FR PH(2,1) PH(3,1) PH(3,2)
FR GA(1,2) GA(1,3) GA(2,1) GA(2,3) GA(3,1) GA(3,2)
VA 1.0 LY(1,1) LY(3,2) LY(5,3)
VA 1.0 LX(1,1) LX(3,2) LX(5,3)
FR TH(1,1) TH(2,2) TH(3,3) TH(4,4) TH(5,5) TH(6,6)
OU AD=OFF
Table 12.7
LISREL Output: Goodness-of-Fit Statistics for Final Model

GOODNESS OF FIT STATISTICS
CHI-SQUARE WITH 39 DEGREES OF FREEDOM = 324.80 (P = 0.0)
ESTIMATED NON-CENTRALITY PARAMETER (NCP) = 285.80
MINIMUM FIT FUNCTION VALUE = 0.35
POPULATION DISCREPANCY FUNCTION VALUE (F0) = 0.31
ROOT MEAN SQUARE ERROR OF APPROXIMATION (RMSEA) = 0.089
EXPECTED CROSS-VALIDATION INDEX (ECVI) = 0.43
ECVI FOR SATURATED MODEL = 0.17
ECVI FOR INDEPENDENCE MODEL = 7.68
CHI-SQUARE FOR INDEPENDENCE MODEL WITH 66 DEGREES OF FREEDOM = 7099.45
INDEPENDENCE AIC = 7123.45
MODEL AIC = 402.80
SATURATED AIC = 156.00
INDEPENDENCE CAIC = 7193.46
MODEL CAIC = 630.33
SATURATED CAIC = 611.06
ROOT MEAN SQUARE RESIDUAL (RMR) = 1.46
STANDARDIZED RMR = 0.059
GOODNESS OF FIT INDEX (GFI) = 0.95
ADJUSTED GOODNESS OF FIT INDEX (AGFI) = 0.89
PARSIMONY GOODNESS OF FIT INDEX (PGFI) = 0.47
NORMED FIT INDEX (NFI) = 0.95
NON-NORMED FIT INDEX (NNFI) = 0.93
PARSIMONY NORMED FIT INDEX (PNFI) = 0.56
COMPARATIVE FIT INDEX (CFI) = 0.96
INCREMENTAL FIT INDEX (IFI) = 0.96
RELATIVE FIT INDEX (RFI) = 0.92
CRITICAL N (CN) = 179.37
CONFIDENCE LIMITS COULD NOT BE COMPUTED DUE TO TOO SMALL P-VALUE FOR CHI-SQUARE
Turning first to the goodness-of-fit statistics for this model (see Table 12.7), we can see that specification of error covariances over the two time points certainly led to a highly significant difference in overall fit (Δχ²(6) = 1377.63), and very substantial GFI and CFI values of .95 and .96, respectively. It is important to note at least two other statistics related to model fit: the RMSEA and the ECVI. The RMSEA value is now .089, thereby representing a difference of .111 between the initial and respecified models. Although this value is at the upper bound of acceptability, it nonetheless indicates a substantial improvement in model fit. The other fit statistic, the ECVI, is interpretable only within the context of comparison. Accordingly, comparison of the ECVI for the respecified model (0.43) with the ECVI for the initially hypothesized model (1.91) also yields a substantial difference (1.48), once again indicating the substantial improvement in fit of the respecified model.

Before turning to the series of path diagrams related to the respecified model, let us first review the model estimates, as reported in the output file; these statistics are shown in Table 12.8. At least three important features of these estimates are worthy of note. First, except for one parameter (ψ33), all parameter estimates are statistically significant.5 Of particular import here are the Theta parameters that, of course, represent the correlated error terms across time. Indeed, the memory effect associated with the first achievement measure (AAET) is clearly substantial; this measure represents a standardized achievement test. Second, notice that the pattern of estimated values in the Phi matrix remains unchanged from the estimates of the initially hypothesized model (i.e., the variance of ASC1 is less than each of the covariances); no problems in convergence, however, were encountered. Finally, recall that the variance of the disturbance term for AA2 (PS[3,3]) in the initially hypothesized model was negative. Although this parameter, for the respecified model, is not statistically significant, it is no longer a negative value.

Now let us have a look at the various path diagrams that are at your disposal in using LISREL 8. It is important to remember that each of these diagrams is obtained by clicking both the Model and Kind buttons on the Path diagram menu bar. Presented first (Fig. 12.13), we see the path diagram for the basic model, accompanied by the maximum likelihood estimates. Shown next, in Fig. 12.14, is the same model, but with the t-values rather than the estimates displayed.6

5Nonetheless, the standard errors, for the most part, are disturbingly high.
6The missing paths, of course, are a consequence of the parameters that were fixed to 1.00.
Table 12.8
LISREL Output: Estimates for Final Model

Number of Iterations = 21

LISREL ESTIMATES (MAXIMUM LIKELIHOOD)

LAMBDA-Y
          GSC2     ASC2     AA2
GSCC2     1.00
GSCR2     1.22
         (0.08)
         14.77
ASCC2              1.00
ASCB2              4.72
                  (0.32)
                  14.96
AAET2                       1.00
AAMA2                       1.52
                           (0.10)
                           15.03

LAMBDA-X
          GSC1     ASC1     AA1
GSCC1     1.00
GSCR1     1.05
         (0.06)
         16.35
ASCC1              1.00
ASCB1              3.71
                  (0.23)
                  16.40
AAET1                       1.00
AAMA1                       2.08
                           (0.14)
                           14.67

GAMMA
          GSC1     ASC1     AA1
GSC2               4.82    -0.78
                  (0.47)   (0.10)
                  10.30    -7.71
ASC2      0.11              0.15
         (0.01)            (0.01)
          9.25             10.92
AA2      -1.21     7.86
         (0.14)   (0.64)
         -8.43    12.31

PHI
          GSC1     ASC1     AA1
GSC1     11.32
         (1.01)
         11.21
ASC1      2.48     1.07
         (0.23)   (0.12)
         10.87     8.78
AA1       3.89     4.08    20.14
         (0.72)   (0.39)   (2.52)
          5.41    10.42     7.99

PSI
          GSC2     ASC2     AA2
          4.27     0.18     1.33
         (0.50)   (0.03)   (0.74)
          8.57     5.40     1.81

THETA-EPS
         GSCC2    GSCR2    ASCC2    ASCB2    AAET2    AAMA2
          7.46     9.65     2.28     7.37    77.13    40.97
         (0.74)   (1.08)   (0.12)   (0.95)   (4.05)   (4.60)
         10.05     8.97    19.75     7.75    19.05     8.91

THETA-DELTA-EPS (diagonal elements: correlated errors across time)
         GSCC1    GSCR1    ASCC1    ASCB1    AAET1    AAMA1
          2.83     2.91     0.75     5.78    47.60    21.04
         (0.54)   (0.67)   (0.09)   (0.59)   (2.91)   (4.02)
          5.26     4.32     8.70     9.78    16.38     5.24

THETA-DELTA
         GSCC1    GSCR1    ASCC1    ASCB1    AAET1    AAMA1
          8.22     9.06     2.31    10.99    50.40    27.91
         (0.70)   (0.78)   (0.12)   (0.67)   (2.57)   (4.00)
         11.72    11.55    20.10    16.31    19.60     6.98
FIG. 12.13. Basic Path Diagram for Model 2 (Final) Showing Maximum Likelihood Estimates.
The path diagram shown in Fig. 12.15 represents the Correlated error model, along with the maximum likelihood estimates related to these parameters. This same model, with the t-values displayed, is shown in Fig. 12.16. LISREL 8 also makes it possible for you to display path diagrams related to either the X- or Y-models. Presented first is the X-model with its estimates displayed (Fig. 12.17), and then with its t-values displayed (Fig. 12.18); these are followed by the same pair of models and accompanying statistics for the Y-model (Figs. 12.19 and 12.20).
FIG. 12.14. Basic Path Diagram for Model 2 (Final) Showing T-Values.

FIG. 12.15. Correlated Errors Path Diagram for Model 2 (Final) Showing Maximum Likelihood Estimates.
FIG. 12.16. Correlated Errors Path Diagram for Model 2 (Final) Showing T-Values.

FIG. 12.17. X-Model Path Diagram for Model 2 (Final) Showing Maximum Likelihood Estimates.
FIG. 12.18. X-Model Path Diagram for Model 2 (Final) Showing T-Values.

FIG. 12.19. Y-Model Path Diagram for Model 2 (Final) Showing Maximum Likelihood Estimates.
FIG. 12.20. Y-Model Path Diagram for Model 2 (Final) Showing T-Values.
Finally, for the sake of comparison with Fig. 12.11, let us review the Structural model presented in Fig. 12.21 with its estimates displayed, and in Fig. 12.22, with its t-values displayed. As with the structural path diagram for the initially hypothesized model (see Fig. 12.11), note that all disturbance terms are reported as negative values on the diagram. However, a quick check of the output file (see Table 12.8) will confirm this not to be the case. Unfortunately, as has been the case for a few of the other program-produced path diagrams, the overlapping print on the structural paths in both Figs. 12.21 and 12.22 makes it difficult, if not impossible, to interpret the reported values.7 However, one can always refer to the output file. Let us turn now to the modification indices pertinent to the respecified model. Because our primary focus has been centered on the structural component of the model, only modification statistics related to these parameters are reported in Table 12.9.

7I trust that these difficulties will be remedied in the next version of the program.
FIG. 12.21. Structural Path Diagram for Model 2 (Final) Showing Maximum Likelihood Estimates.

FIG. 12.22. Structural Path Diagram for Model 2 (Final) Showing T-Values.
Table 12.9
LISREL Output: Modification Indices for Structural Parameters of Final Model

MODIFICATION INDICES FOR GAMMA
          GSC1     ASC1     AA1
GSC2     15.75
ASC2              78.88
AA2                       29.54

EXPECTED CHANGE FOR GAMMA
          GSC1     ASC1     AA1
GSC2      1.28
ASC2              -2.66
AA2                        4.02

MODIFICATION INDICES FOR PSI
          GSC2     ASC2     AA2
GSC2      - -
ASC2     69.32     - -
AA2      60.81    26.35    - -

EXPECTED CHANGE FOR PSI
          GSC2     ASC2     AA2
GSC2      - -
ASC2      0.50     - -
AA2       2.97     0.38    - -
A review of these values suggests that a substantial degree of misfit still remains in the model. In particular, it appears that the incorporation of a path from ASC1 to ASC2, as indicated by the modification index of 78.88, should lead to a substantially better fitting model; in addition, note these indices with respect to AA1 and AA2, as well as to covariances among the disturbance terms. However, it is important that these modification indices be assessed in combination with their related expected change values. In a study of both of these statistics, Saris et al. (1987) distinguished among four scenarios: (a) large modification index, large expected parameter change; (b) large modification index, small expected parameter change; (c) small modification index, large expected parameter change; and (d) small modification index, small expected parameter change. The first situation provides strong evidence that the fixed parameter in question should be freely estimated, whereas situations (b) and (d) argue against estimation and situation (c) is
somewhat ambiguous. This important work by Saris and colleagues, therefore, would argue for model respecification that incorporated a path from AA1 to AA2, rather than from ASC1 to ASC2. Indeed, the negative expected parameter change with respect to the latter path (-2.66) would seem to be somewhat incompatible with self-concept theory, as well as being substantively unreasonable. On the basis of this rationale, I would argue that any subsequent respecification of the model should incorporate the former, rather than the latter, structural path. Now let us view the LISREL path diagrams related to these modification indices. Shown in Figs. 12.23 and 12.24 are the Basic and Structural models, respectively, together with their associated modification index values. Figure 12.25 displays the expected parameter change statistics related to the Structural model.
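The four Saris et al. (1987) scenarios lend themselves to a simple decision sketch. In the Python fragment below, the MI cutoff is derived from the 7.88 display criterion noted earlier (the 99.5th percentile of chi-square with 1 df, i.e., the squared 99.75th standard-normal quantile), while the expected-change cutoff is purely illustrative, as there is no single agreed threshold.

```python
from statistics import NormalDist

# MI display cutoff used by LISREL 8: 99.5th percentile of chi-square(1),
# computed as the squared 99.75th standard-normal quantile (about 7.88).
MI_CUTOFF = NormalDist().inv_cdf(1 - 0.005 / 2) ** 2

def saris_scenario(mi, epc, mi_cutoff=MI_CUTOFF, epc_cutoff=1.0):
    # Classify a fixed parameter into the four Saris et al. scenarios;
    # the EPC cutoff is illustrative only.
    big_mi, big_epc = mi >= mi_cutoff, abs(epc) >= epc_cutoff
    if big_mi and big_epc:
        return "a: strong evidence for freeing the parameter"
    if big_mi:
        return "b: argues against freeing"
    if big_epc:
        return "c: ambiguous"
    return "d: argues against freeing"

# Both candidate paths from Table 12.9 fall in scenario (a); the choice
# between them then rests on the sign and substantive plausibility of the
# expected change, as argued in the text.
print(round(MI_CUTOFF, 2))            # 7.88
print(saris_scenario(29.54, 4.02))    # AA1 -> AA2
print(saris_scenario(78.88, -2.66))   # ASC1 -> ASC2
```

Note that a mechanical rule of this kind cannot adjudicate between two scenario (a) candidates; that decision remains a substantive one.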
FIG. 12.23. Basic Path Diagram for Model 2 (Final) Showing Modification Indices.
The Windows version of LISREL 8 provides you with enormous visual flexibility in determining the viability of model respecification. More specifically, it allows you to add paths or error covariances on an existing path diagram, to display the expected estimates for these additional parameters, and to estimate your revised diagram simply by clicking on the Re-Estimate
FIG. 12.24. Structural Path Diagram for Model 2 (Final) Showing Modification Indices.
FIG. 12.25. Structural Path Diagram for Model 2 (Final) Showing Expected Parameter Change Estimates.
button; likewise, the program allows you to delete paths, error covariances, or both. Although I demonstrate only the addition of one path here, readers are referred to Joreskog and Sorbom (1993a) for details related to other model respecifications. To demonstrate the addition of a path to an existing diagram, we return to our respecified model described earlier (Model 2). Taking into account the modification indices and expected parameter change values (see Table 12.9), we determined that a logical next step in respecifying the model would be to incorporate a path flowing from AA1 to AA2. To define this path in the model, you begin by pressing any arrow key. Using the arrow key to place the cursor on AA2, I pressed the Enter key; this action triggered the warning message shown in Fig. 12.26. After clicking on the Yes button shown in Fig. 12.26, the program proceeded to reestimate the model. I subsequently placed the cursor once again on AA2 and pressed the Enter key; this action caused the path to be
To change the model it must be estimated first! Press 'YES' to do this.
FIG. 12.26. Warning Message Initiated by Addition of Path to Existing Path Diagram of Model 2 (Final).
Chapter 12
FIG. 12.27. Basic Path Diagram for Model 2 (Final) Showing Added Path from AA1 to AA2.
added to the model, along with its expected estimated value, as can be seen in Fig. 12.27. For purposes of demonstration here, I then selected t-values from the Kind drop-down menu; given that the model had not yet been estimated, this resulted in the display of question marks relative to this path as shown in Fig. 12.28. Finally, clicking on the Re-Estimate button resulted in the error message shown in Fig. 12.29.8 Given these results for the latter respecified model (Model 3), we can conclude that Model 2 represents the best fit to the data. Considering Model 2 as the final model, we can now proceed in testing for causal predominance. Although three primary hypotheses can be tested in this regard, only one is discussed here. Specifically, we test the hypothesis that academic achievement is causally predominant over academic self-concept (i.e., AA causes ASC, rather than the reverse). This is accomplished by first estimating a

8 Interestingly, this model was estimated using the DOS version with no convergence problems whatsoever. Nonetheless, the disturbance term associated with AA2 once again yielded a negative variance estimate.
FIG. 12.28. Basic Path Diagram for Model 2 (Final) Showing Unknown T-value for Added Path.
model in which the competing paths are constrained equal (e.g., AA1 → ASC2 = ASC1 → AA2), and then comparing the fit of this model with one in which the same paths are specified as free. The difference in χ² values between the two models provides the basis for determining statistical significance, and the size of the parameter estimates determines which of the paths is causally dominant. A review of Fig. 12.8 will confirm that the testing of this hypothesis involved the comparison of paths γ32 and γ23. The input file for this test of equality is shown in Table 12.10. In reviewing this input file, you will note that, in addition to the EQ command, the keyword IM=3 has been added to the OU line. Interestingly, it appears that inclusion of the IM keyword has been made necessary by the formulation of the Theta matrices specific to LISREL 8. Indeed, tests of the same hypotheses, based on earlier versions of the program, demanded that the model be specified within the framework of the all-X model; the IM keyword was not necessary. However, when the model was estimated here
FIG. 12.29. Error Message from Estimation of Model 3.
with specification of the new Theta matrix (Θδε), albeit with IM not included on the OU line, a convergent solution could not be achieved. Why should the testing of the same model precipitate these inconsistent findings? K. G. Joreskog (personal communication, November 26, 1996) explained that when such convergence difficulties arise, the situation can often be resolved by using Fisher's Scoring algorithm rather than the Davidon-Fletcher-Powell algorithm, which is the default in LISREL 8. Essentially, the difference between the two algorithms lies in the manner by which the inverse of the information matrix is computed. Whereas the specification IM=3 causes the information matrix to be recomputed and inverted with each iteration, the specification of IM=2 (the default value) results in the inverse of the information matrix merely being updated with each iteration. Thus, because the IM=3 specification precipitates more work in each iteration, it requires fewer iterations in order to reach convergence. On the other hand, given that the IM=2 specification has been shown to be superior in terms of total computer time, it has been selected as the default for the LISREL program.
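The algorithmic contrast Joreskog describes can be illustrated with a deliberately simple, generic sketch; this is not LISREL's implementation, and the one-parameter exponential-rate example (and every name in it) is invented purely for illustration. What it shows is the defining feature of Fisher's Scoring that IM=3 requests: the information quantity is recomputed, and its inverse applied, anew at every iteration.

```python
def fisher_scoring_rate(data, lam=1.0, tol=1e-10, max_iter=50):
    """Fisher scoring for the MLE of an exponential rate parameter.

    Generic illustration only (not LISREL's internals). For an
    exponential sample the score is n/lam - sum(x) and the Fisher
    information is n/lam**2, so the MLE is n / sum(x).
    """
    n = len(data)
    s = sum(data)
    for _ in range(max_iter):
        score = n / lam - s
        info = n / lam ** 2      # information recomputed each iteration
        step = score / info      # ...and inverted (here, a 1x1 "matrix")
        lam += step
        if abs(step) < tol:
            break
    return lam
```

A quasi-Newton scheme such as Davidon-Fletcher-Powell would instead carry the inverse forward and merely update it each iteration: cheaper per step, but typically needing more steps, which is exactly the IM=2 versus IM=3 trade-off described above.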
Results determined the difference in χ² between the two estimated models to be statistically significant (Δχ²(1) = 13.26). To determine which of the two paths was causally predominant, we turn to the final model estimates (see Table 12.8). Because the estimated path leading from academic self-concept to academic achievement (ASC1 → AA2) was higher (7.86) than the one leading from academic achievement to academic self-concept (AA1 → ASC2 = 0.15), we conclude that, in contrast to the stated hypothesis, academic self-concept is causally predominant over academic achievement.
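As a quick numerical check on that significance claim, the tail probability of a χ² difference on one degree of freedom can be computed from the complementary error function alone; the fragment below is a minimal sketch (the 13.26 value is taken from the text, and the function names are my own), not part of any LISREL output.

```python
import math

def chi2_sf_df1(x):
    """P(X > x) for a chi-square variate with 1 df.

    For one degree of freedom the survival function reduces to
    erfc(sqrt(x / 2)), so no statistics library is required.
    """
    return math.erfc(math.sqrt(x / 2.0))

delta_chi2 = 13.26        # chi-square difference reported in the text
critical_05 = 3.841       # critical value for df = 1, alpha = .05
p_value = chi2_sf_df1(delta_chi2)
significant = delta_chi2 > critical_05
```

Because 13.26 far exceeds the .05 critical value of 3.841 for one degree of freedom, the equality constraint is clearly untenable.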
Table 12.10
LISREL Input File: Testing for Equality of Paths

! Testing for Causal Predominance Between SC and AA "SCAAEQ"
! Specific Test of Equality Between ASC and AA
! Error Covariances Across Time (TH matrix)
! ASC1 -> AA2 = AA1 -> ASC2
DA NI=19 NO=929 MA=CM
LA
SCHOOL LEVEL ABILITY CLASS ID GRADE SEX ASCC2 ASCB2 GSCC2 GSCR2
AAET2 AAMA2 ASCC1 ASCB1 GSCC1 GSCR1 AAET1 AAMA1
RA=NOX.DAT FO
(3F1.0,F2.0,F3.0,2F1.0,2X,6(4X,F4.0),T13,6(F4.0,4X))
SELECTION
10 11 8 9 12 13 16 17 14 15 18 19/
MO NY=6 NX=6 NE=3 NK=3 LX=FU,FI LY=FU,FI GA=FU,FI PH=SY,FI PS=SY,FI TE=SY TD=SY
LE
GSC2 ASC2 AA2
LK
GSC1 ASC1 AA1
FR LY(2,1) LY(4,2) LY(6,3)
FR LX(2,1) LX(4,2) LX(6,3)
FR PH(1,1) PH(2,2) PH(3,3)
FR PH(2,1) PH(3,1) PH(3,2)
FR GA(1,2) GA(1,3) GA(2,1) GA(2,3) GA(3,1) GA(3,2)
FR PS(1,1) PS(2,2) PS(3,3)
VA 1.0 LY(1,1) LY(3,2) LY(5,3)
VA 1.0 LX(1,1) LX(3,2) LX(5,3)
FR TH(1,1) TH(2,2) TH(3,3) TH(4,4) TH(5,5) TH(6,6)
EQ GA(2,3) GA(3,2)
OU IM=3 AD=OFF
FROM THE SIMPLIS PERSPECTIVE Thus far in this chapter, I have demonstrated various operations associated with the WINDOWS version of LISREL 8. In particular, I have illustrated how to (a) work from a path diagram in respecifying and reestimating a model, (b) view particular components of a specified model, along with selected statistics, and (c) add paths or covariances to, or delete them from, an existing path diagram. However, another feature of the WINDOWS version
is that it allows you to build an input file from an empty path diagram. I have chosen to illustrate this procedure from a SIMPLIS file. For purely pedagogical reasons, I base this SIMPLIS application on a restructured data file that eliminated the redundant variables shown in the LISREL 8 files. By doing so, I have another opportunity to show you how this reduced data set may be created using PRELIS 2.
FIG. 12.30. Empty Path Diagram for Hypothesized Model.
Table 12.11
PRELIS Input: Creation of Smaller Data Set

! Creation of Smaller Data Set for Use with SIMPLIS "SCAASIMP"
! Two-wave Panel CFA Model
DA NI=19 NO=929 MA=CM
LA
SCHOOL LEVEL ABILITY CLASS ID GRADE SEX ASCC2 ASCB2 GSCC2 GSCR2
AAET2 AAMA2 ASCC1 ASCB1 GSCC1 GSCR1 AAET1 AAMA1
RA=NOX.DAT FO
(3F1.0,F2.0,F3.0,2F1.0,2X,6(4X,F4.0),T13,6(F4.0,4X))
SD SCHOOL-SEX ! Only Variables ASCC2-AAMA1 in Analysis
CO ALL
OU RA=SCAASIMP.DAT WI=2 ND=0
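The column selection that the SD line performs can be mirrored outside PRELIS; the fragment below is a hypothetical illustration of the same operation (the variable names come from the LA line above, but the function and constant names are invented): every variable from SCHOOL through SEX is deleted, leaving the 12 self-concept and achievement measures.

```python
# Hypothetical mirror of the PRELIS line "SD SCHOOL-SEX": delete all
# variables from SCHOOL through SEX inclusive, keeping the twelve
# self-concept and achievement measures for the SIMPLIS analyses.
NAMES = ["SCHOOL", "LEVEL", "ABILITY", "CLASS", "ID", "GRADE", "SEX",
         "ASCC2", "ASCB2", "GSCC2", "GSCR2", "AAET2", "AAMA2",
         "ASCC1", "ASCB1", "GSCC1", "GSCR1", "AAET1", "AAMA1"]

def select_variables(record, names=NAMES, drop_from="SCHOOL", drop_to="SEX"):
    """Return one 19-value record with the drop_from..drop_to block
    removed, as the SD keyword would."""
    start = names.index(drop_from)
    stop = names.index(drop_to) + 1
    return record[:start] + record[stop:]

KEPT = select_variables(NAMES)   # the 12 variable names that remain
```

Applied to every record of NOX.DAT, this yields the same 12-variable file that PRELIS writes to SCAASIMP.DAT.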
The PRELIS 2 input file is shown in Table 12.11. In reviewing this file, you will note that the original data file (from which the subset will be derived) is the same one that was used for the LISREL analyses (see Table 12.1). The SD keyword tells the program to delete all variables up to and including SEX, and the CO keyword indicates that all variables in the new data set are to be treated as having a continuous scale. Finally, the OU line indicates that the new raw data file will bear the label "SCAASIMP.DAT," with each variable comprising two columns and no decimal places. Within the WINDOWS environment, therefore, this data set is created simply by clicking on the PRELIS 2 button on the main menu bar.

Now let us turn to the SIMPLIS input file to be used in structuring an empty model; this file is displayed in Table 12.12. Aside from the name of the data file and sample size, all that is required in this input file is that you (a) specify the names of the observed variables, (b) identify which of these represent dependent variables, and (c) include the PATH DIAGRAM command. Because the path diagram that we wish to produce comprises both latent and observed variables, it is imperative that we specify the dependent variable names for each. Thus, the former are identified as ETA-Variables, and the latter as Y-Variables. By clicking on the LISREL 8 button in the main menu, the program will then produce an empty diagram such as the one shown in Fig. 12.30. Using the arrow keys, as illustrated earlier in the addition of a path (see Fig. 12.27), you then assemble the various components of your path diagram and estimate the resulting model. Although this brief introduction can, at best,
Table 12.12
SIMPLIS Input: Creation of an Empty Model for Two-Wave Panel Study
! Testing for Causal Predominance Between SC and AA "SCAAEM.SIM"
! Two-wave Panel CFA Model
! Creating an Empty Model
OBSERVED VARIABLES: ASCC2 ASCB2 GSCC2 GSCR2 AAET2 AAMA2
ASCC1 ASCB1 GSCC1 GSCR1 AAET1 AAMA1
RAW DATA from FILE SCAASIMP.DAT
SAMPLE SIZE: 929
LATENT VARIABLES: GSC2 ASC2 AA2 GSC1 ASC1 AA1
ETA-VARIABLES: GSC2 ASC2 AA2
Y-VARIABLES: GSCC2 GSCR2 ASCC2 ASCB2 AAET2 AAMA2
PATH DIAGRAM
END OF PROBLEM
Table 12.13
SIMPLIS Input: Final Model of Causal Predominance

! Testing for Causal Predominance Between SC and AA "SCAAF.SIM"
! Two-wave Panel CFA Model
! Final Model
OBSERVED VARIABLES: ASCC2 ASCB2 GSCC2 GSCR2 AAET2 AAMA2
ASCC1 ASCB1 GSCC1 GSCR1 AAET1 AAMA1
RAW DATA from FILE SCAASIMP.DAT
SAMPLE SIZE: 929
LATENT VARIABLES: GSC2 ASC2 AA2 GSC1 ASC1 AA1
EQUATIONS:
GSCC2 = 1*GSC2
GSCR2 = GSC2
ASCC2 = 1*ASC2
ASCB2 = ASC2
AAET2 = 1*AA2
AAMA2 = AA2
GSCC1 = 1*GSC1
GSCR1 = GSC1
ASCC1 = 1*ASC1
ASCB1 = ASC1
AAET1 = 1*AA1
AAMA1 = AA1
PATHS:
GSC1 -> ASC2 AA2
ASC1 -> GSC2 AA2
AA1 -> GSC2 ASC2
Let the errors between GSCC1 and GSCC2 correlate
Let the errors between GSCR1 and GSCR2 correlate
Let the errors between ASCC1 and ASCC2 correlate
Let the errors between ASCB1 and ASCB2 correlate
Let the errors between AAET1 and AAET2 correlate
Let the errors between AAMA1 and AAMA2 correlate
PATH DIAGRAM
LISREL output AD=OFF
END OF PROBLEM
give you only a flavor of the entire file-building procedure, space limitations restrict further elaboration of this technique here. Thus, I recommend that interested readers refer to Joreskog and Sorbom (1993a) for additional details and illustrations.

For the sake of completeness, I conclude this chapter by reviewing the SIMPLIS file equivalent to the LISREL input file representing the final model of best fit used in testing for causal predominance between academic achievement and academic self-concept (see Table 12.6); the SIMPLIS input file is shown in Table 12.13. In reviewing this SIMPLIS file, I see three features that are perhaps worthy of note. First, in comparing this file with the LISREL file, you will observe that there are fewer observed variables; this is the case, of course, because the SIMPLIS analysis is based on the reduced data set (SCAASIMP.DAT). Second, for purposes of illustration and comparison with other SIMPLIS
files, the structural equations have been presented as structural paths; as such, the paths are indicated by means of arrows. For example, the first path indicates that GSC1 causes ASC2 and AA2. Finally, consistent with the LISREL file, all measurement errors are specified as correlated across time.
APPLICATIONS RELATED TO OTHER DISCIPLINES

Education: Reynolds, A. J., & Walberg, H. J. (1991). A structural model of science achievement. Journal of Educational Psychology, 83, 97-107.

Gerontology: Schaie, K. W., Maitland, S. B., Willis, S. L., & Intrieri, R. C. (1997). Longitudinal invariance of adult psychometric ability factor structures across seven years. Psychology and Aging, 12, 1-18.

Marketing: Brown, S. P., Cron, W. L., & Slocum, Jr., J. W. (1997). Effects of goal-directed emotions on salesperson volitions, behavior, and performance: A longitudinal study. Journal of Marketing, 61, 39-50.

Medicine: Wickrama, K., Conger, R. D., Lorenz, F. O., & Matthews, L. (1995). Role identity, role satisfaction, and perceived physical health. Social Psychology Quarterly, 58, 270-283.
REFERENCES

Abramson, L. Y., Metalsky, G. I., & Alloy, L. B. (1988). The hopelessness theory of depression: Does the research test the theory? In L. Y. Abramson (Ed.), Social cognition and clinical psychology: A synthesis (pp. 33-65). New York: Guilford.
Aitchison, J., & Silvey, D. C. (1958). Maximum likelihood estimation of parameters subject to restraints. Annals of Mathematical Statistics, 29, 813-828.
Aish, A. M., & Joreskog, K. G. (1990). A panel model for political efficacy and responsiveness: An application of LISREL 7 with weighted least squares. Quality and Quantity, 19, 716-723.
Akaike, H. (1987). Factor analysis and AIC. Psychometrika, 52, 317-332.
Aldrich, J. H., & Nelson, F. D. (1984). Linear probability, logit, and probit models. Beverly Hills, CA: Sage.
Anderson, J. C., & Gerbing, D. W. (1984). The effect of sampling error on convergence, improper solutions, and goodness-of-fit indices for maximum likelihood confirmatory factor analysis. Psychometrika, 49, 155-173.
Anderson, J. C., & Gerbing, D. W. (1988). Structural equation modeling in practice: A review and recommended two-step approach. Psychological Bulletin, 103, 411-423.
Arbuckle, J. L. (1995). AMOS user's guide. Chicago: Smallwaters.
Atkinson, L. (1988). The measurement-statistics controversy: Factor analysis and subinterval data. Bulletin of the Psychonomic Society, 26, 361-364.
Austin, J. T., & Calderón, R. F. (1996). Theoretical and technical contributions to structural equation modeling: An updated annotated bibliography. Structural Equation Modeling: A Multidisciplinary Journal, 105-175.
Babakus, E., Ferguson, C. E., Jr., & Joreskog, K. G. (1987). The sensitivity of confirmatory maximum likelihood factor analysis to violations of measurement scale and distributional assumptions. Journal of Marketing Research, 24, 222-228.
Bacharach, S. B., Bauer, S. C., & Conley, S. (1986). Organizational analysis of stress: The case of elementary and secondary schools. Work and Occupation, 13, 7-32.
Bagozzi, R. P. (1993). Assessing construct validity in personality research: Applications to measures of self-esteem. Journal of Research in Personality, 27, 49-87.
Bagozzi, R. P., & Yi, Y. (1990). Assessing method variance in multitrait-multimethod matrices: The case of self-reported affect and perceptions at work. Journal of Applied Psychology, 75, 547-560.
Bagozzi, R. P., & Yi, Y. (1993). Multitrait-multimethod matrices in consumer research: Critique and new developments. Journal of Consumer Psychology, 2, 143-170.
Bandalos, D. L. (1993). Factors influencing cross-validation of confirmatory factor analysis models. Multivariate Behavioral Research, 28, 351-374.
Beck, A. T., Ward, C. H., Mendelson, M., Mock, J., & Erbaugh, J. (1961). An inventory for measuring depression. Archives of General Psychiatry, 4, 561-571.
Benson, J. (1987). Detecting item bias in affective scales. Educational and Psychological Measurement, 47, 55-67.
Benson, J., & Bandalos, D. L. (1992). Second-order confirmatory factor analysis of the Reactions to Tests Scale with cross-validation. Multivariate Behavioral Research, 27, 459-487.
Bentler, P. M. (1980). Multivariate analysis with latent variables: Causal modeling. Annual Review of Psychology, 31, 419-456.
Bentler, P. M. (1988). Causal modeling via structural equation systems. In J. R. Nesselroade & R. B. Cattell (Eds.), Handbook of multivariate experimental psychology (2nd ed., pp. 317-335). New York: Plenum.
Bentler, P. M. (1990). Comparative fit indexes in structural models. Psychological Bulletin, 107, 238-246.
Bentler, P. M. (1992). On the fit of models to covariances and methodology to the Bulletin. Psychological Bulletin, 112, 400-404.
Bentler, P. M. (1995). EQS: Structural equations program manual. Encino, CA: Multivariate Software Inc.
Bentler, P. M., & Bonett, D. G. (1980). Significance tests and goodness of fit in the analysis of covariance structures. Psychological Bulletin, 88, 588-606.
Bentler, P. M., & Bonett, D. G. (1987). This week's citation classic. Current Contents, Institute for Scientific Information, 9, 16.
Bentler, P. M., & Chou, C. P. (1987). Practical issues in structural modeling. Sociological Methods & Research, 16, 78-117.
Bentler, P. M., & Wu, E. J. C. (1995). EQS for Windows: User's guide. Encino, CA: Multivariate Software Inc.
Biddle, B. J., & Marlin, M. M. (1987). Causality, confirmation, credulity, and structural equation modeling. Child Development, 58, 4-17.
Bollen, K. A. (1989a). A new incremental fit index for general structural models. Sociological Methods & Research, 17, 303-316.
Bollen, K. A. (1989b). Structural equations with latent variables. New York: Wiley.
Bollen, K. A., & Barb, K. H. (1981). Pearson's r and coarsely categorized measures. American Sociological Review, 46, 232-239.
Bollen, K. A., & Long, J. S. (Eds.). (1993). Testing structural equation models. Newbury Park, CA: Sage.
Boomsma, A. (1982). The robustness of LISREL against small sample sizes in factor analysis models. In H. Wold & K. Joreskog (Eds.), Systems under indirect observation (pp. 149-173). New York: Elsevier North-Holland.
Boomsma, A. (1985). Nonconvergence, improper solutions, and starting values in LISREL maximum likelihood estimation. Psychometrika, 50, 229-242.
Bozdogan, H. (1987). Model selection and Akaike's information criteria (AIC): The general theory and its analytical extensions. Psychometrika, 52, 345-370.
Breckler, S. J. (1990). Applications of covariance structure modeling in psychology: Cause for concern? Psychological Bulletin, 107, 260-271.
Brookover, W. B. (1962). Self-Concept of Ability Scale. East Lansing, MI: Educational Publication Services.
Browne, M. W. (1982). Covariance structures. In D. M. Hawkins (Ed.), Topics in applied multivariate analysis (pp. 72-141). Cambridge, England: Cambridge University Press.
Browne, M. W. (1984a). Asymptotically distribution-free methods for the analysis of covariance structures. British Journal of Mathematical and Statistical Psychology, 37, 62-83.
Browne, M. W. (1984b). The decomposition of multitrait-multimethod matrices. British Journal of Mathematical and Statistical Psychology, 37, 1-21.
Browne, M. W., & Cudeck, R. (1989). Single sample cross-validation indices for covariance structures. Multivariate Behavioral Research, 24, 445-455.
Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 136-162). Newbury Park, CA: Sage.
Browne, M. W., Mels, G., & Coward, M. (1994). Path analysis: RAMONA: SYSTAT for DOS: Advanced applications (version 6). Evanston, IL: SYSTAT.
Byrne, B. M. (1986). Self-concept/academic achievement relations: An investigation of dimensionality, stability, and causality. Canadian Journal of Behavioural Science, 18, 173-186.
Byrne, B. M. (1988a). Adolescent self-concept, ability grouping, and social comparison: Reexamining academic track differences in high school. Youth and Society, 20, 46-67.
Byrne, B. M. (1988b). Measuring adolescent self-concept: Factorial validity and equivalency of the SDQ III across gender. Multivariate Behavioral Research, 24, 361-375.
Byrne, B. M. (1988c). The Self Description Questionnaire III: Testing for equivalent factorial validity across ability. Educational and Psychological Measurement, 48, 397-406.
Byrne, B. M. (1989). A primer of LISREL: Basic applications and programming for confirmatory factor analytic models. New York: Springer-Verlag.
Byrne, B. M. (1991). The Maslach Burnout Inventory: Validating factorial structure and invariance across intermediate, secondary, and university educators. Multivariate Behavioral Research, 26, 583-605.
Byrne, B. M. (1993). The Maslach Burnout Inventory: Testing for factorial validity and invariance across elementary, intermediate, and secondary teachers. Journal of Occupational and Organizational Psychology, 66, 197-212.
Byrne, B. M. (1994a). Burnout: Testing for the validity, replication, and invariance of causal structure across elementary, intermediate, and secondary teachers. American Educational Research Journal, 31, 645-673.
Byrne, B. M. (1994b). Testing for the factorial validity, replication, and invariance of a measuring instrument: A paradigmatic application based on the Maslach Burnout Inventory. Multivariate Behavioral Research, 29, 289-311.
Byrne, B. M. (1995). Strategies in testing for an invariant second-order factor structure: A comparison of EQS and LISREL. Structural Equation Modeling: A Multidisciplinary Journal, 2, 53-72.
Byrne, B. M. (1996). Measuring self-concept across the lifespan: Issues and instrumentation. Washington, DC: American Psychological Association.
Byrne, B. M. (in press). The nomological network of teacher burnout: A literature review and empirically validated model. In M. Huberman & R. Vandenberghe (Eds.), Understanding and preventing teacher burnout: A sourcebook of international research and practice. Cambridge, England: Cambridge University Press.
Byrne, B. M., & Baron, P. (1993). The Beck Depression Inventory: Testing and cross-validating an hierarchical factor structure for nonclinical adolescents. Measurement and Evaluation in Counseling and Development, 26, 164-178.
Byrne, B. M., & Baron, P. (1994). Measuring adolescent depression: Tests of equivalent factorial structure for English and French versions of the Beck Depression Inventory. Applied Psychology: An International Review, 43, 33-47.
Byrne, B. M., Baron, P., & Balev, J. (1996). The Beck Depression Inventory: Testing for its factorial validity and invariance across gender for Bulgarian adolescents. Personality and Individual Differences, 21, 641-651.
Byrne, B. M., Baron, P., & Balev, J. (in press). The Beck Depression Inventory: A cross-validated test of second-order structure for Bulgarian adolescents. Educational and Psychological Measurement.
Byrne, B. M., Baron, P., & Campbell, T. L. (1993). Measuring adolescent depression: Factorial validity and invariance of the Beck Depression Inventory across gender. Journal of Research on Adolescence, 3, 127-143.
Byrne, B. M., Baron, P., & Campbell, T. L. (1994). The Beck Depression Inventory (French version): Testing for gender-invariant factorial structure for nonclinical adolescents. Journal of Adolescent Research, 9, 166-179.
Byrne, B. M., Baron, P., Larsson, B., & Melin, L. (1995). The Beck Depression Inventory: Testing and cross-validating a second-order factorial structure for Swedish nonclinical adolescents. Behaviour Research and Therapy, 33, 345-356.
Byrne, B. M., Baron, P., Larsson, B., & Melin, L. (1996). Measuring depression for Swedish nonclinical adolescents: Factorial validity and equivalence of the Beck Depression Inventory across gender. Scandinavian Journal of Psychology, 37, 37-45.
Byrne, B. M., & Bazana, P. G. (1996). Investigating the measurement of social and academic competencies for early/late preadolescents and adolescents: A multitrait-multimethod analysis. Applied Measurement in Education, 9, 113-132.
Byrne, B. M., & Goffin, R. D. (1993). Modeling MTMM data from additive and multiplicative covariance structures: An audit of construct validity concordance. Multivariate Behavioral Research, 28, 67-96.
Byrne, B. M., & Shavelson, R. J. (1986). On the structure of adolescent self-concept. Journal of Educational Psychology, 78, 474-481.
Byrne, B. M., & Shavelson, R. J. (1987). Adolescent self-concept: Testing the assumption of equivalent structure across gender. American Educational Research Journal, 24, 365-385.
Byrne, B. M., & Shavelson, R. J. (1996). On the structure of social self-concept for pre-, early, and late adolescents. Journal of Personality and Social Psychology, 70, 599-613.
Byrne, B. M., Shavelson, R. J., & Muthen, B. (1989). Testing for the equivalence of factor covariance and mean structures: The issue of partial measurement invariance. Psychological Bulletin, 105, 456-466.
Byrne, B. M., & Worth Gavin, D. A. (1996). The Shavelson model revisited: Testing for the structure of academic self-concept across pre-, early, and late adolescents. Journal of Educational Psychology, 88, 215-228.
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81-105.
Carlson, M., & Mulaik, S. A. (1993). Trait ratings from descriptions of behavior as mediated by components of meaning. Multivariate Behavioral Research, 28, 111-159.
Chou, C. P., & Bentler, P. M. (1995). Estimates and tests in structural equation modeling. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 37-55). Thousand Oaks, CA: Sage.
Chou, C. P., Bentler, P. M., & Satorra, A. (1991). Scaled test statistics and robust standard errors for non-normal data in covariance structure analysis: A Monte Carlo study. British Journal of Mathematical and Statistical Psychology, 44, 347-357.
Cliff, N. (1983). Some cautions concerning the application of causal modeling methods. Multivariate Behavioral Research, 18, 115-126.
Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49, 997-1003.
Cole, D. A., & Maxwell, S. E. (1985). Multitrait-multimethod comparisons across populations: A confirmatory factor analytic approach. Multivariate Behavioral Research, 20, 389-417.
Comrey, A. L. (1992). A first course in factor analysis. Hillsdale, NJ: Lawrence Erlbaum Associates.
Cudeck, R. (1988). Multiplicative models and MTMM matrices. Journal of Educational Statistics, 13, 131-147.
Cudeck, R. (1989). Analysis of correlation matrices using covariance structure models. Psychological Bulletin, 105, 317-327.
Cudeck, R., & Browne, M. W. (1983). Cross-validation of covariance structures. Multivariate Behavioral Research, 18, 147-167.
Cudeck, R., & Henly, S. J. (1991). Model selection in covariance structures analysis and the "problem" of sample size: A clarification. Psychological Bulletin, 109, 512-519.
Curran, P. J., West, S. G., & Finch, J. F. (1996). The robustness of test statistics to nonnormality and specification error in confirmatory factor analysis. Psychological Methods, 1, 16-29.
Dillon, W. R., Kumar, A., & Mulani, N. (1987). Offending estimates in covariance structure analysis: Comments on the causes of and solutions to Heywood cases. Psychological Bulletin, 101, 126-135.
Fornell, C. (1982). A second generation of multivariate analysis Vol. 1: Methods. New York: Praeger.
Francis, D. J., Fletcher, J. M., & Rourke, B. P. (1988). Discriminant validity of lateral sensorimotor tests in children. Journal of Clinical and Experimental Neuropsychology, 10, 779-799.
Gerbing, D. W., & Anderson, J. C. (1984). On the meaning of within-factor correlated measurement errors. Journal of Consumer Research, 11, 572-580.
Gerbing, D. W., & Anderson, J. C. (1993). Monte Carlo evaluations of goodness-of-fit indices for structural equation models. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 40-65). Newbury Park, CA: Sage.
Goffin, R. D. (1987). The analysis of multitrait-multimethod matrices: An empirical Monte Carlo comparison of two procedures. Unpublished doctoral dissertation, University of Western Ontario, London, Canada.
Goffin, R. D. (1993). A comparison of two new indices for the assessment of fit of structural equation models. Multivariate Behavioral Research, 28, 205-214.
Gorsuch, R. L. (1983). Factor analysis. Hillsdale, NJ: Lawrence Erlbaum Associates.
Green, D. R. (1975). What does it mean to say a test is biased? Education and Urban Society, 8, 33-52.
Harter, S. (1985). Manual for the Self-Perception Profile for Children: Revision of the Perceived Competence Scale for Children. Denver, CO: University of Denver.
Harter, S. (1990). Causes, correlates, and the functional role of global self-worth: A lifespan perspective. In R. J. Sternberg & J. Kolligian (Eds.), Competence considered (pp. 67-97). New Haven, CT: Yale University Press.
Hayduk, L. A. (1987). Structural equation modeling with LISREL: Essentials and advances. Baltimore: Johns Hopkins University Press.
Hoelter, J. W. (1983). The analysis of covariance structures: Goodness-of-fit indices. Sociological Methods & Research, 11, 325-344.
Hoyle, R. H. (1995a). The structural equation modeling approach: Basic concepts and fundamental issues. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 1-15). Thousand Oaks, CA: Sage.
Hoyle, R. H. (Ed.). (1995b). Structural equation modeling: Concepts, issues, and applications. Thousand Oaks, CA: Sage.
Hu, L. T., & Bentler, P. M. (1995). Evaluating model fit. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 76-99). Thousand Oaks, CA: Sage.
Hu, L. T., Bentler, P. M., & Kano, Y. (1992). Can test statistics in covariance structure analysis be trusted? Psychological Bulletin, 112, 351-362.
James, L. R., Mulaik, S. A., & Brett, J. M. (1982). Causal analysis: Assumptions, models, and data. Beverly Hills, CA: Sage.
Joreskog, K. G. (1971a). Simultaneous factor analysis in several populations. Psychometrika, 36, 409-426.
Joreskog, K. G. (1971b). Statistical analysis of sets of congeneric tests. Psychometrika, 36, 109-133.
Joreskog, K. G. (1990). New developments in LISREL: Analysis of ordinal variables using polychoric correlations and weighted least squares. Quality and Quantity, 24, 387-404.
Joreskog, K. G. (1993). Testing structural equation models. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 294-316). Newbury Park, CA: Sage.
Joreskog, K. G., & Sorbom, D. (1978). LISREL IV user's guide. Chicago: National Educational Resources.
Joreskog, K. G., & Sorbom, D. (1979). Advances in factor analysis and structural equation models. New York: University Press of America.
Joreskog, K. G., & Sorbom, D. (1988). LISREL 7: A guide to the program and applications. Chicago: SPSS Inc.
Joreskog, K. G., & Sorbom, D. (1989). LISREL 7: User's reference guide. Chicago: Scientific Software Inc.
Joreskog, K. G., & Sorbom, D. (1993a). LISREL 8: Structural equation modeling with the SIMPLIS command language. Chicago: Scientific Software International.
Joreskog, K. G., & Sorbom, D. (1993b). LISREL 8: User's reference guide. Chicago: Scientific Software International.
Joreskog, K. G., & Sorbom, D. (1993c). PRELIS 2: User's reference guide. Chicago: Scientific Software International.
Joreskog, K. G., & Sorbom, D. (1996a). LISREL 8: User's reference guide. Chicago: Scientific Software International.
Joreskog, K. G., & Sorbom, D. (1996b). PRELIS: User's reference guide. Chicago: Scientific Software International.
Joreskog, K. G., & Sorbom, D. (1996c, April). Structural equation modeling. Workshop presented for the NORC Social Science Research Professional Development Training Sessions, Chicago.
Joreskog, K. G., & Yang, F. (1996). Nonlinear structural equation models: The Kenny-Judd model with interaction effects. In G. A. Marcoulides & R. E. Schumacker (Eds.), Advanced structural equation modeling: Issues and techniques (pp. 57-88). Mahwah, NJ: Lawrence Erlbaum Associates.
Kenny, D. A. (1976). An empirical application of confirmatory factor analysis to the multitrait-multimethod matrix. Journal of Experimental Social Psychology, 12, 247-252.
Kenny, D. A. (1979). Correlation and causality. New York: Wiley.
Kenny, D. A., & Kashy, D. A. (1992). Analysis of the multitrait-multimethod matrix by confirmatory factor analysis. Psychological Bulletin, 112, 165-172.
Kerlinger, F. N. (1984). Liberalism and conservatism: The nature and structure of social attitudes. Hillsdale, NJ: Lawrence Erlbaum Associates.
Kirk, R. E. (1996). Practical significance: A concept whose time has come. Educational and Psychological Measurement, 56, 746-759.
La Du, T. J., & Tanaka, J. S. (1989). Influence of sample size, estimation method, and model specification on goodness-of-fit assessments in structural equation modeling. Journal of Applied Psychology, 74, 625-636.
Lee, S. Y., & Bentler, P. M. (1980). Some asymptotic properties of constrained generalized least squares estimation in covariance structure models. South African Statistical Journal, 14, 121-136.
Leiter, M. P. (1991). Coping patterns as predictors of burnout: The functions of control and escapist coping patterns. Journal of Organizational Behavior, 12, 123-144.
Little, R. J. A., & Rubin, D. B. (1987). Statistical analysis with missing data. New York: Wiley.
Loehlin, J. C. (1992). Latent variable models: An introduction to factor, path, and structural analyses. Hillsdale, NJ: Lawrence Erlbaum Associates.
Long, J. S. (1983a). Confirmatory factor analysis. Beverly Hills, CA: Sage.
Long, J. S. (1983b). Covariance structure models: An introduction to LISREL. Beverly Hills, CA: Sage.
MacCallum, R. C. (1986). Specification searches in covariance structure modeling. Psychological Bulletin, 100, 107-120.
MacCallum, R. C. (1995). Model specification: Procedures, strategies, and related issues. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 16-36). Newbury Park, CA: Sage.
MacCallum, R. C., Browne, M. W., & Sugawara, H. M. (1996). Power analysis and determination of sample size for covariance structure modeling. Psychological Methods, 1, 130-149.
MacCallum, R. C., Roznowski, M., Mar, M., & Reith, J. V. (1994). Alternative strategies for cross-validation of covariance structure models. Multivariate Behavioral Research, 29, 1-32.
MacCallum, R. C., Roznowski, M., & Necowitz, L. B. (1992). Model modifications in covariance structure analysis: The problem of capitalization on chance. Psychological Bulletin, 111, 490-504.
MacCallum, R. C., Wegener, D. T., Uchino, B. N., & Fabrigar, L. R. (1993). The problem of equivalent models in applications of covariance structure analysis. Psychological Bulletin, 114, 185-199.
Marcoulides, G. A., & Schumacker, R. E. (Eds.). (1996). Advanced structural equation modeling: Issues and techniques. Mahwah, NJ: Lawrence Erlbaum Associates.
Marsh, H. W. (1987). The hierarchical structure of self-concept and the application of hierarchical confirmatory factor analyses. Journal of Educational Measurement, 24, 17-39.
Marsh, H. W. (1988). Multitrait-multimethod analyses. In J. P. Keeves (Ed.), Educational research methodology, measurement, and evaluation: An international handbook (pp. 570-578). Oxford, England: Pergamon.
Marsh, H. W. (1989). Confirmatory factor analyses of multitrait-multimethod data: Many problems and a few solutions. Applied Psychological Measurement, 15, 47-70.
Marsh, H. W. (1992a). Self Description Questionnaire (SDQ) I: A theoretical and empirical basis for the measurement of multiple dimensions of preadolescent self-concept: A test manual and research monograph. Macarthur, Australia: Faculty of Education, University of Western Sydney.
Marsh, H. W. (1992b). Self Description Questionnaire (SDQ) III: A theoretical and empirical basis for the measurement of multiple dimensions of late adolescent self-concept: An interim test manual and research monograph. Macarthur, Australia: Faculty of Education, University of Western Sydney.
Marsh, H. W., & Bailey, M. (1991). Confirmatory factor analyses of multitrait-multimethod data: A comparison of alternative models. Applied Psychological Measurement, 15, 47-70.
Marsh, H. W., Balla, J. R., & McDonald, R. P. (1988). Goodness-of-fit indexes in confirmatory factor analysis: The effect of sample size. Psychological Bulletin, 103, 391-410.
Marsh, H. W., & Grayson, D. (1995). Latent variable models of multitrait-multimethod data. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 177-198). Thousand Oaks, CA: Sage.
Marsh, H. W., & Hocevar, D. (1985). The application of confirmatory factor analysis to the study of self-concept: First- and higher order factor structures and their invariance across age groups. Psychological Bulletin, 97, 562-582.
Maslach, C., & Jackson, S. E. (1981). Maslach Burnout Inventory manual. Palo Alto, CA: Consulting Psychologists Press.
Maslach, C., & Jackson, S. E. (1986). Maslach Burnout Inventory manual (2nd ed.). Palo Alto, CA: Consulting Psychologists Press.
McCarthy, M. (1990). The thin ideal, depression, and eating disorders in women. Behavior Research and Therapy, 28, 205-215.
McDonald, R. P. (1985). Factor analysis and related methods. Hillsdale, NJ: Lawrence Erlbaum Associates.
McDonald, R. P., & Marsh, H. W. (1990). Choosing a multivariate model: Noncentrality and goodness-of-fit. Psychological Bulletin, 107, 247-255.
Mislevy, R. J. (1986). Recent developments in the factor analysis of categorical variables. Journal of Educational Statistics, 11, 3-31.
Mueller, R. O. (1996). Basic principles of structural equation modeling: An introduction to LISREL and EQS. New York: Springer-Verlag.
Mulaik, S. A. (1972). The foundations of factor analysis. New York: McGraw-Hill.
Mulaik, S. A., James, L. R., Van Alstine, J., Bennett, N., Lind, S., & Stilwell, C. D. (1989). Evaluation of goodness-of-fit indices for structural equation models. Psychological Bulletin, 105, 430-445.
Muthen, B. O. (1984). A general structural equation model with dichotomous, ordered categorical, and continuous latent variable indicators. Psychometrika, 49, 115-132.
Muthen, B. O. (1988). LISCOMP: Analysis of linear structural equations using a comprehensive measurement model: User's guide. Chicago: Scientific Software International.
Muthen, B., & Christoffersson, A. (1981). Simultaneous factor analysis of dichotomous variables in several groups. Psychometrika, 46, 407-419.
Muthen, B., & Kaplan, D. (1985). A comparison of some methodologies for the factor analysis of non-normal Likert variables. British Journal of Mathematical and Statistical Psychology, 38, 171-189.
Petersen, A. C., & Taylor, B. (1980). The biological approach to adolescence. In J. Adelson (Ed.), Handbook of adolescent psychology (pp. 117-155). New York: Wiley.
Pettegrew, L. S., & Wolf, G. E. (1982). Validating measures of teacher stress. American Educational Research Journal, 19, 373-396.
Rindskopf, D. (1984). Structural equation models: Empirical identification, Heywood cases, and related problems. Sociological Methods and Research, 13, 109-119.
Rindskopf, D., & Rose, T. (1988). Some theory and applications of confirmatory second-order factor analysis. Multivariate Behavioral Research, 23, 51-67.
Rock, D. A., Werts, C. E., & Flaugher, R. L. (1978). The use of covariance structures for comparing the psychometric properties of multiple variables across populations. Multivariate Behavioral Research, 13, 403-418.
Rozeboom, W. W. (1960). The fallacy of the null hypothesis significance test. Psychological Bulletin, 57, 416-428.
Saris, W. E., Satorra, A., & Sorbom, D. (1987). The detection and correction of specification errors in structural equation models. In C. Clogg (Ed.), Sociological methodology 1987 (pp. 105-130). San Francisco: Jossey-Bass.
Saris, W. E., & Stronkhorst, H. (1984). Causal modelling in nonexperimental research: An introduction to the LISREL approach. Amsterdam: Sociometric Research Foundation.
SAS Institute Inc. (1992). The CALIS procedure extended user's guide. Cary, NC: Author.
Satorra, A., & Bentler, P. M. (1988). Scaling corrections for chi-square statistics in covariance structure analysis. In American Statistical Association 1988 proceedings of the business and economics section (pp. 308-313). Alexandria, VA: American Statistical Association.
Satorra, A., & Saris, W. E. (1985). Power of the likelihood ratio test in covariance structure analysis. Psychometrika, 50, 83-90.
Schmidt, F. L. (1996). Statistical significance testing and cumulative knowledge in psychology: Implications for training of researchers. Psychological Methods, 1, 115-129.
Schmitt, N., & Stults, D. M. (1986). Methodology review: Analysis of multitrait-multimethod matrices. Applied Psychological Measurement, 10, 1-22.
Schumacker, R. E., & Lomax, R. G. (1996). A beginner's guide to structural equation modeling. Mahwah, NJ: Lawrence Erlbaum Associates.
Shavelson, R. J., Hubner, J. J., & Stanton, G. C. (1976). Self-concept: Validation of construct interpretations. Review of Educational Research, 46, 407-441.
Soares, A. T., & Soares, L. M. (1979). The Affective Perception Inventory: Advanced level. Trumbell, CT: ALSO.
Sobel, M. F., & Bohrnstedt, G. W. (1985). Use of null models in evaluating the fit of covariance structure models. In N. B. Tuma (Ed.), Sociological methodology 1985 (pp. 152-178). San Francisco: Jossey-Bass.
Sorbom, D. (1974). A general method for studying differences in factor means and factor structures between groups. British Journal of Mathematical and Statistical Psychology, 27, 229-239.
Steiger, J. H. (1990). Structural model evaluation and modification: An interval estimation approach. Multivariate Behavioral Research, 25, 173-180.
Steiger, J. H. (1994). Structural equation modeling with SePath: Technical documentation. Tulsa, OK: STATSOFT.
Steiger, J. H., & Lind, J. C. (1980, June). Statistically based tests for the number of common factors. Paper presented at the Psychometric Society Annual Meeting, Iowa City, IA.
Stine, R. A. (1989). An introduction to bootstrap methods: Examples and ideas. Sociological Methods and Research, 8, 243-291.
Striegel-Moore, R. H., Silberstein, L. R., & Rodin, J. (1986). Towards an understanding of risk factors for bulimia. American Psychologist, 41, 246-263.
Sugawara, H. M., & MacCallum, R. C. (1993). Effect of estimation method on incremental fit indexes for covariance structure models. Applied Psychological Measurement, 17, 365-377.
Tanaka, J. S. (1993). Multifaceted conceptions of fit in structural equation models. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 10-39). Newbury Park, CA: Sage.
Tanaka, J. S., & Huba, G. J. (1984). Confirmatory hierarchical factor analyses of psychological distress measures. Journal of Personality and Social Psychology, 46, 621-635.
Thompson, B. (1995). Exploring the replicability of a study's results: Bootstrap statistics for the multivariate case. Educational and Psychological Measurement, 55, 84-94.
Thompson, B. (1996). AERA editorial policies regarding statistical significance testing: Three suggested reforms. Educational Researcher, 25, 26-30.
Thompson, B. (1997). The importance of structure coefficients in structural equation modeling confirmatory factor analysis. Educational and Psychological Measurement, 57, 5-19.
Thompson, B., & Daniel, L. G. (1996). Factor analytic evidence for the construct validity of scores: A historical overview and some guidelines. Educational and Psychological Measurement, 56, 197-208.
Wald, A. (1943). Tests of statistical hypotheses concerning several parameters when the number of observations is large. Transactions of the American Mathematical Society, 54, 426-482.
Werts, C. E., Rock, D. A., Linn, R. L., & Joreskog, K. G. (1976). Comparison of correlations, variances, covariances, and regression weights with or without measurement error. Psychological Bulletin, 83, 1007-1013.
West, S. G., Finch, J. F., & Curran, P. J. (1995). Structural equation models with nonnormal variables: Problems and remedies. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 56-75). Thousand Oaks, CA: Sage.
Wheaton, B. (1987). Assessment of fit in overidentified models with latent variables. Sociological Methods & Research, 16, 118-154.
Wheaton, B., Muthen, B., Alwin, D. F., & Summers, G. F. (1977). Assessing reliability and stability in panel models. In D. R. Heise (Ed.), Sociological methodology 1977 (pp. 84-136). San Francisco: Jossey-Bass.
Widaman, K. F. (1985). Hierarchically nested covariance structure models for multitrait-multimethod data. Applied Psychological Measurement, 9, 1-26.
Williams, L. J., & Holahan, P. J. (1994). Parsimony-based fit indices for multiple-indicator models: Do they work? Structural Equation Modeling, 1, 161-189.
Wothke, W. (1987, April). Multivariate linear models of the multitrait-multimethod matrix. Paper presented at the American Educational Research Association Annual Meeting, Washington, DC.
Wothke, W. (1993). Nonpositive definite matrices in structural modeling. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 256-293). Newbury Park, CA: Sage.
Wothke, W. (1996). Models for multitrait-multimethod matrix analysis. In G. A. Marcoulides & R. E. Schumacker (Eds.), Advanced structural equation modeling: Issues and techniques (pp. 7-56). Mahwah, NJ: Lawrence Erlbaum Associates.
Yung, Y. F., & Bentler, P. M. (1996). Bootstrapping techniques in analysis of mean and covariance structures. In G. A. Marcoulides & R. E. Schumacker (Eds.), Advanced structural equation modeling: Issues and techniques (pp. 195-226). Mahwah, NJ: Lawrence Erlbaum Associates.
Author Index

A
Abramson, L. Y., 184
Aish, A. M., 147
Aitchison, J., 252
Akaike, H., 115
Aldrich, J. H., 73
Alloy, L. B., 184
Alwin, D. F., 111
Anderson, J. C., 20, 94, 112, 118, 142, 147, 148, 158, 290, 327
Arbuckle, J. L., 9
Atkinson, L., 137
Austin, J. T., 9, 118
B
Babakus, E., 99, 137
Bacharach, S. B., 236
Bagozzi, R. P., 194, 214
Bailey, M., 194, 204, 220, 221
Balev, J., 163, 328
Balla, J. R., 112
Bandalos, D. L., 115, 118, 161, 328
Barb, K. H., 100
Baron, P., 163, 327, 328
Bauer, S. C., 236
Bazana, P. G., 194, 199
Beck, A. T., 163
Benson, J., 296, 328
Bentler, P. M., 3, 4, 9, 14, 17, 20, 43, 69, 99, 103, 109, 111, 112, 114, 115, 116, 117, 118, 119, 122, 137, 142, 151, 165, 172, 175, 177, 244, 252, 261, 262, 303, 304, 305, 309, 312
Biddle, B. J., 157
Bohrnstedt, G. W., 116, 119
Bollen, K. A., 6, 9, 17, 19, 28, 40, 60, 61, 100, 103, 104, 110, 111, 117, 118, 151, 169, 175, 267, 338
Bonett, D. G., 109, 114, 117
Boomsma, A., 95, 99, 103, 175
Bozdogan, H., 115
Breckler, S. J., 99, 157
Brett, J. M., 116
Brookover, W. B., 310
Browne, M. W., 9, 79, 99, 107, 109, 111, 112, 113, 114, 116, 118, 148, 157, 158, 159, 160, 161, 194
Byrne, B. M., 9, 91, 92, 99, 113, 117, 129, 135, 136, 137, 147, 150, 157, 163, 170, 194, 199, 216, 231, 236, 241, 252, 260, 261, 263, 269, 274, 281, 287, 290, 303, 310, 321, 327, 328, 335, 338, 345, 352
C
Calderon, R. F., 9, 118
Campbell, D. T., 193, 213, 216, 227
Campbell, T. L., 163
Carlson, M., 117
Chou, C. P., 20, 99, 119, 137, 153, 175, 177
Christoffersson, A., 274
Cliff, N., 109, 147, 152, 157
Cohen, J., 107, 108
Cole, D. A., 297
Comrey, A. L., 6
Conley, S., 236
Coward, M., 9
Cudeck, R., 50, 95, 107, 109, 111, 112, 113, 114, 116, 118, 147, 157, 158, 159, 160, 161, 227, 327
Curran, P. J., 98, 118, 196
D
Daniel, L. G., 6
Dillon, W., 175, 178
E
Erbaugh, J., 163
F
Ferguson, C. E., Jr., 99, 137
Finch, J. F., 98, 118
Fiske, D. W., 193, 213, 216, 227
Flaugher, R. L., 297
Fletcher, J. M., 328
Fornell, C., 3
Francis, D. J., 328
G
Gerbing, D. W., 20, 94, 112, 118, 142, 147, 148, 158, 291, 327
Goffin, R. D., 117, 142, 194, 216, 220
Gorsuch, R. L., 6
Grayson, D., 194, 204, 218, 220, 227
H
Harter, S., 20, 92
Hayduk, L. A., 6, 9, 17, 28, 60, 68
Henly, S. J., 111, 158, 159, 160, 327
Hocevar, D., 290
Hoelter, J. W., 118
Holahan, P. J., 116, 117, 118
Hoyle, R. H., 9, 60, 141
Hu, L. T., 99, 111, 112, 115, 116, 118
Huba, G. J., 157, 163, 170, 290
Hubner, J. J., 91
J
Jackson, S. E., 135, 136-137, 238
James, L. R., 116, 117
Joreskog, K. G., 4, 8, 10, 17, 30, 43, 45, 50, 57, 60, 61, 67, 68, 69, 73, 77, 81, 94, 95, 99, 103, 104, 110, 111, 113, 115, 116, 118, 119, 121, 125, 126, 137, 143, 145, 147, 150, 157, 164, 165, 166, 175, 180, 181, 184, 202, 260, 261, 262, 281, 296, 297, 304, 306, 307, 309, 317, 345, 352, 355, 362, 377, 380, 384
K
Kano, Y., 99
Kaplan, D., 99, 137
Kashy, D. A., 194, 202, 220, 221, 222-223
Kenny, D. A., 194, 202, 218, 220, 221, 222-223
Kerlinger, F. N., 19
Kirk, R. E., 108
Kumar, A., 176
L
La Du, T. J., 118
Larsson, B., 163
Lee, S. Y., 252
Leiter, M. P., 233
Lind, J. C., 109, 112
Linn, R. L., 262
Little, R. J. A., 77
Loehlin, J. C., 9
Lomax, R. G., 9
Long, J. S., 6, 9, 17, 28
M
MacCallum, R. C., 8, 9, 107, 109, 110, 111, 112, 113, 11~ 125, 126, 141, 14~ 15~ 15~ 159, 161, 291, 327
Mar, M., 109
Marcoulides, G. A., 9
Marlin, M. M., 157
Marsh, H. W., 18, 20, 112, 116, 117, 118, 193, 194, 202, 204, 218, 220, 221, 227, 266, 287, 290
Maslach, C., 135, 136, 137, 238
Maxwell, S. E., 297
McCarthy, M., 180, 185
McDonald, R. P., 6, 112, 117
Melin, L., 163
Mels, G., 9
Mendelson, M., 163
Metalsky, G. I., 184
Mislevy, R. J., 99
Mock, J., 163
Mueller, R. O., 9, 87, 118
Mulaik, S. A., 6, 113, 116, 117, 118
Mulani, N., 176
Muthen, B. O., 9, 99, 111, 137, 157, 261, 274, 295
N
Necowitz, L. B., 8, 9
Nelson, F. D., 73
P
Petersen, A. C., 180
Pettegrew, L. S., 238
R
Reith, J. V., 109
Rindskopf, D., 34, 173, 175, 189
Rock, D. A., 262, 297
Rodin, J., 180
Rose, T., 34, 173, 189
Rourke, B. P., 328
Rozeboom, W. W., 108
Roznowski, M., 8, 9, 109
Rubin, D. B., 77
S
Saris, W. E., 9, 17, 28, 111, 122, 374, 375
SAS Institute, 9
Satorra, A., 99, 111, 122
Schmidt, F. L., 108
Schmitt, N., 193, 201
Schumacker, R. E., 9
Shavelson, R. J., 91, 157, 260, 263, 274, 281
Silberstein, L. R., 180
Silvey, S. D., 252
Soares, A. T., 263, 310
Soares, L. M., 263, 310
Sobel, M. F., 116, 119
Sorbom, D., 4, 10, 17, 30, 43, 45, 50, 57, 60, 61, 67, 68, 69, 73, 77, 81, 95, 103, 110, 111, 113, 115, 116, 118, 121, 122, 126, 137, 143, 164, 165, 166, 175, 185, 202, 262, 281, 306, 307, 309, 317, 345, 352, 355, 362, 377, 384
Stanton, G. C., 91
Steiger, J. H., 9, 107, 109, 111, 112
Stine, R. A., 69
Striegel-Moore, R. H., 180
Stronkhorst, H., 9, 17, 28
Stults, D. M., 193, 201
Sugawara, H. M., 107, 118, 141
Summers, G. F., 111
T
Tanaka, J. S., 112, 118, 157, 163, 170, 290
Taylor, B., 180
Thompson, B., 6, 69, 108
W
Wald, A., 252
Ward, C. H., 163
Werts, C. E., 262, 297
West, S. G., 98, 99, 118
Wheaton, B., 111, 118, 119, 125
Widaman, K. F., 194, 199, 201, 204, 213, 214
Williams, L. J., 116, 117, 118
Wolf, G. E., 238
Worth Gavin, D. A., 91, 129, 216
Wothke, W., 20, 99, 174, 176, 194, 202, 220
Wu, E. J. C., 99
Y
Yang, F., 304
Yi, Y., 194, 214
Yung, Y. F., 69
Subject Index

A
a** (PRELIS), 75
AC, see Asymptotic covariance (AC)
AD, see Admissibility check (AD)
ADF, see Asymptotic distribution-free (ADF) estimator
Admissibility check (AD), 62, 175, 355
  check off (AD=OFF), 62, 175, 201, 355
  and errors, 66-68
Affective Perception Inventory, 263, 310
AGFI, 107, 115-116, 117
AIC, see Akaike's Information Criterion (AIC)
Akaike's Information Criterion (AIC), 115, 142
AL, see Print all output (AL)
ALL option
  LISREL, 44
  PRELIS, 73
All X-models, 18, 20-24
All Y-models, 18, 24-26, 36
Alternative models (AM), 8
AM, see Augmented moment matrix (AM)
AM (Joreskog), see Alternative models (AM)
American Psychologist, 108
AMOS, 9
Arrows, 14-16, 33
Assumptions (LISREL), 12, 306
Asterisk (*)
  LISREL, 48
  PRELIS, 75, 131
  SIMPLIS, 85
Asymptotic covariance (AC), 189
  LISREL, 61, 68, 172
  PRELIS, 79-80, 166
  SIMPLIS, 83
Asymptotic distribution-free (ADF) estimator, 99
Asymptotic variance (AV), 61
Augmented moment matrix (AM), 46, 77
AV, see Asymptotic variance (AV)
B
B, see Beta matrix (B)
Baseline models, 262, 263, 266, 288-291
Basic model, 356, 375
BDI, see Beck Depression Inventory (BDI)
Beck Depression Inventory (BDI), 163-164
Behavior, defined, 4
Beta matrix (B), 12, 39-40, 239
Bias, 73, 193
Bidirectional arrow, 15
Biserial (PRELIS), 79
Biserial correlations, 73, 164, 166
Blanks (LISREL), 47, 48, 56, 66
Board of Scientific Affairs for the American Psychological Association, 108
Bootstrap sampling (PRELIS), 69, 76
British Journal of Mathematical and Statistical Psychology, 9
Burnout, 231-233, 237-238, 244, 247, 250
C
c, see Continuation
CA, see Censored variables from above (CA)
CAIC, see Consistent version of Akaike's Information Criterion (CAIC)
Calibration/validation samples, 330-337, 339
CALIS, 9
Case condition, 76
Case selection (SC) (PRELIS), 76
Case sensitivity, 44
Categorical data, analysis of, 99-100, 164-170
Categorical labels (CL) (PRELIS), 71-72
Causal model, 14
Causal predominance, testing for, 345
  correlated error model, 368
  goodness-of-fit statistics, 365
  hypothesized model, 351-353
  input file, 352
  output file, 353-381
  structural model, 372-375
  using SIMPLIS, 381-385
Causal structures
  invariance testing, 327-328
    across calibration/validation samples, 330-337
    hypothesized model, 328-330
    using SIMPLIS, 338-341
  validity testing (LISREL), 231-233, 238-239
    confirmatory factor analyses, 236, 238
    formulation of indicator variables, 236
    input file, 239, 241
    output file, 241-242
    post hoc analyses, 244-253
    preliminary analyses, 234
    preliminary analyses using PRELIS, 233-234
  validity testing (SIMPLIS), 253-255
Cause, 14
CB, see Censored variables from below (CB)
CE, see Censored variables from above and below (CE)
Ceiling effect, 69
Censored data, 69
Censored variables from above (CA), 73
Censored variables from above and below (CE), 73
Censored variables from below (CB), 73
CFA, see Confirmatory factor analyses (CFA)
CFI, see Comparative Fit Index (CFI), 109-112
Chi-square (χ²), 107, 109-112, 114
Chi-square/degrees of freedom ratio, 111-112
CI, inability to compute, 142
CL, see Categorical labels (CL)
Classroom Environment Scale, 236
CM, see Covariance matrix (CM)
CN, see Critical N (CN)
CO, see Continuous (CO) variables
CO ALL (PRELIS), 131, 140, 196
Commas
  LISREL, 47, 56
  PRELIS, 71
Comments (LISREL), 44
Comparative Fit Index (CFI), 114, 117, 142, 242
Complete latent variable (LV) model, see Full latent variable (LV) model
Completely standardized solution, 62, 152-153
Composite Direct Product model, 194, 227
Confidence intervals, 109, 111, 112-113
Confirmatory factor analyses (CFA), 5, 6, 17-19, 236, 238
  first-order factor models, 18-19, 26-28
    all X-models, 20-24, 26
    all Y-models, 24-26
    compared with higher-order levels, 31-37
    example of LISREL input file, 62-63
    example of SIMPLIS input file, 84-85
  general models, 194
  of measuring instruments, 136, 163
  second-order factor models, 18-19, 31-37
    example of LISREL input file, 63-64
    example of SIMPLIS input file, 85-86
Congeneric sets, 94
Consistent version of Akaike's Information Criterion (CAIC), 115, 143
Constant variables, 306, 308, 314
Constraints, 33, 55, 97
  equality, 272-281
Construct validity testing, multitrait-multimethod models, 193-194
  CFA approach to, 199
  comparison of models, 213-215
  Correlated Uniqueness model, 218-227
  examination of parameters, 215-218
  input files, 200-202
  output files, 202-204, 207-209
  preliminary analysis using PRELIS, 194-199
  using SIMPLIS, 227-229
Continuation ("c") (LISREL), 44
Continuous (CO) variables, 73, 74, 98-100, 165, 383
Control lines
  LISREL, 45
  PRELIS, 70
Convergent validity, 193, 199, 213, 216, 221
Correlated errors, 24, see also Error terms
Correlated errors model, in Windows, 362
Correlated methods, 199, 200, 201, 207, 215-216
  in Correlated Uniqueness model, 222-223, 227
  freely, 207-208
Correlated traits, 199, 200, 201, 215-216
  absence of, 207
  different, 214-215
  freely, 209
  perfectly, 207-208
  same, 213
Correlated Uniqueness model, 194, 202, 204, 218-227
Correlation, representation of, 15
Correlation coefficients, 99
Correlation matrix (KM), 77, 165-166
  LISREL, 46
  SIMPLIS, 83, 189
Covariance matrix (CM), 46, 47, 140
  automatic computation of (LISREL), 50
Covariances, 17, 23-26, 33
  representation of, 15
Covariance structure modeling, cross-validation in, 156-161
Covariance structures, 304
CPU, default TM, 62
Criterion variables, see Endogenous variables
Critical N (CN), 118
Cross-validation, see also Causal structures, invariance testing
  in covariance structure modeling, 156-161
Cross-Validation Index (CVI), 159, 160-161
CVI, see Cross-Validation Index (CVI)
D
DA, see Data specification (DA)
Data
  censored, 69
  external files, 72
  in model fitting equations, 7-8
Data errors, 67-69
Data manipulation (PRELIS), 70, 74-80
Data management, and PRELIS, 69
Data parameters (DA), see Data specification (DA)
Data screening (PRELIS), 69, 194-199, 233-234
Data specification (DA)
  LISREL, 44, 45, 95, 139-141, 173, 268
    and errors, 66
    optional input, 49-51
    required input, 46-49
  PRELIS, 70-71, 140, 239
Davidon-Fletcher-Powell algorithm, 380
Decimal places, number of (ND)
  LISREL, 62
  PRELIS, 78, 131
Default values, 53-54, 155
Degrees of freedom, 111
Dependent variables, see Endogenous variables
DI, see Diagonal matrix (DI)
Diagonal elements, 24, 97, 174
Diagonal matrix (DI), 52-53, 97, 200
Diagonally weighted least squares (DWLS), 60-61
Dichotomous variables, 73
Differential cross-loading, 281-283
  in SIMPLIS, 283-284
DI,FR (LISREL), 97, 98, 200
Discriminant validity, 193, 199, 214-215, 216, 217, 221
Disturbance terms, see also Error terms; Residual error terms
  defined, 11
  in full models, 37
  in second-order models, 35-36
  symbols for, 14, 15
Division, prohibited in PRELIS, 75
Double subscripts, 23
Dummy variables, see Constant variables
DWLS, see Diagonally weighted least squares (DWLS)
E
ECVI, see Expected Cross-Validation Index (ECVI)
Edit option, Windows, 347
Educators' Survey (MBI Form Ed), see Maslach Burnout Inventory (MBI), Form Ed
EF, see Total effects (EF)
EFA, see Exploratory factor analysis (EFA)
Elements, representation of, 11
END OF PROBLEM (SIMPLIS), 84, 133
Endogenous variables, 10, 11, 238-239
all-Y models, 18 variances and covariances of, 17 EPC statistics, 179, 181 EQ, see Equal to (EQ) EQS, 9, 14, 17, 175,241 compared with LISREL, 99, 252-253 Equality constraints, 272-281 Equal sign (=) (LISREL), 66 Equal to (EQ), 55, 172,272-273,379 EQUATION statement (SIMPLIS), 84, 87, 189,253,284,285 Error covariances, see Correlated errors Error messages LISREL, 51, 60, 65-69,173-175,190-191 SIMPLIS, 284-285 Errors data, 67-69 standard, see Standard errors syntax,66 systematic, 148 Type I or 11,158 Error terms, 14, 15,20,33,38,98,125-126, 150-152,180, see also Disturbance terms; Measurement terms representation of, 24 in Theta Delta matrix, 220 Error variance-covariance matrix, 97 Estimation, 109,269, see also Parameter estimation biased,73 negative, 60 negative variance estimates, 174-175 ETA,52 Examination of parameters, testing for construct validity, 215-218 Exclamation mark (!) LISREL,44,46,95 PRELIS,70 SIMPLIS,83 Exogenous variables, 10, 11,238-239 all X-models, 18,23 variances and covariances of, 17, 23-24 Expected Cross-Validation Index (ECVI), 107,113-114,115,142 Expected Parameter Change statistics, 242, 244 Expected Parameter Change value, 122 Exploratory factor analysis (EFA), 5-6 Exponentiation (a*"), PRELIS, 75 External data files, 72
Subject Index
F
Factor analysis, types of, 5-6, see also Confirmatory factor analyses
Factor correlations, 215-216
Factorial validity of measuring instruments
  testing (first-order model) (LISREL), 135-136
    input file, 139-141, 148, 151-152
    model assessment, 141-142
    output file, 141, 148-150, 151-153
    preliminary analysis using PRELIS, 137, 140
  testing (first-order model) (SIMPLIS), 156-161
Factorial validity of theoretical constructs
  testing (first-order model) (LISREL)
    hypothesized models of self-concept (SC), 91-92
      four-factor structure, 92-94, 109-111
      one-factor structure, 128-129
      two-factor structure, 126-128
    input files, 94-100
    output files, 100-126
  testing (first-order model) (SIMPLIS)
    input file, 132-133
    output file, 133-134
    preliminary analyses using PRELIS, 130-132
Factor loadings, 6, 215-216
Factor-loading matrix, 23
Factors, see Latent variables
Factor scores regression (FS), 61
FI, see Fixed parameters (FI)
File extension names
  PRELIS, 80-81
  Windows, 346-347
File menu, Windows, 346
First-order factor models, see Confirmatory factor analyses (CFA), first-order factor models
Fisher's Scoring algorithm, 380
Fit models, see Model fitting
Fit statistics, Windows, 358
Fixed format (FO)
  LISREL, 47-48, 95, 97
  PRELIS, 72
  SIMPLIS, 83, 228
Fixed parameters (FI), 53, 54, 55-59, 141, 200, 307
  noncentrality parameter (NCP), 111
  value assignment for, 59-60
Floor effect, 69
FO, see Fixed format (FO)
Font option, Windows, 356
Formatting, Windows, 348
FORTRAN statement, 48, 49, 95
  PRELIS, 131
  SIMPLIS, 285
FR, see Free parameters (FR)
Free format, 47, 48-49
Free parameters (FR), 53, 54, 55-59, 97, 141, 200
Frequency distributions, 121
FROM filename.ext (SIMPLIS), 83
FS, see Factor scores regression (FS)
FU, see Full matrix (FU)
Full latent variable (LV) model, 7
  invariant causal structure, see Causal structures, invariance testing
  validity testing, see Causal structures, validity testing
Full matrix (FU), 23, 35, 47, 52-53, 54, 97, 141
Full structural equation model, 37-41
  example of LISREL input file, 64-65
  example of SIMPLIS input file, 86
G
Gamma matrix, 12, 40, 239, 242
Generalized least squares (GLS), 60-61
Getting started, Windows, 350
GF, 143
GFI, see Goodness-of-Fit Index (GFI)
GLS, see Generalized least squares (GLS)
Goodness-of-fit, 7, 103-119, 141-142, 179, 185
  in causal predominance testing, 365
  in Windows, 358
Goodness-of-Fit Index (GFI), 107, 115-116, 117
Greek letters, 10-13
GROUP, 281-282
H
Help menu, Windows, 350
Heywood cases, 176, see also Negative estimates
Higher-order factors, see Second-order factors
I
ID, see Identity matrix (ID)
Identification, see Statistical identification
Identity matrix (ID), 52, 53
IFI, see Incremental Index of Fit (IFI)
IM, see Imputation of missing value lines (IM)
Imputation of missing value lines (IM), 76-77, 379-380
IN, see Invariant (IN)
Inadmissible values, 175
Incremental Index of Fit (IFI), 117
Independence model, 114-115
Independent measures, 213-215
Independent variables, see Exogenous variables
Indicators, 4, 14, 20
  formulation of, 236
Indices of fit, 109, 111, 114-118, see also Modification indices (MI)
Initial point, see Threshold values
Instrumental variables method (IV), 60-61
Intercept values, 17n, 306, 308-309, 312-314
Intermediate solution, 173, 355
Invariance testing
  causal structures, see Causal structures, invariance testing
  theoretical constructs, see Theoretical constructs, testing for invariant factorial structure
Invariant (IN), 307
Invariant testing, latent mean structures, see Latent mean structures, testing for invariant
IV, see Instrumental variables method (IV)
J
Journal of Educational Statistics, 9
Just-identified models, 29
K
Kind button, Windows, 349, 357, 358, 360, 365
KM, see Correlation matrix (KM)
KSI, 52, 100
Kurtosis, 169, 196, 198
L
LA, see Labels (LA)
Labels (LA)
  LISREL, 47, 50, 54-55, 95, 98
  PRELIS, 70, 71-72, 131
  SIMPLIS, 83
Lagrange Multiplier Test (LMTEST), 252, 328
Lambda-X (Λ) matrix, 56, 58, 97, 140, 148
Latent mean structures
  conceptual framework, 304-306
  estimation of latent variable means, 308-309
  LISREL conceptual framework, 306-308
  testing for invariant, 303
    baseline models, 310
    hypothesized model, 309-310
    input file, 315-317
    output file, 317-321
    structured means model, 310-314
    using SIMPLIS, 322-324
Latent variables (LV), 4-5
  continuous, 99
  endogenous, 10, 11, 238-239
  exogenous, 10, 11, 23
  full, 7
  labeling, 54-55
  levels of, 33-35
  model specification for, 52
  scale determination, 29-30, 57
  SIMPLIS, 84, 87, 284
  standardization of, 30
LET command, 191, 229
Likelihood Ratio Test statistic, 109-110
Limited information estimators, 99
LISCOMP, 9
LISREL, 9-10, 43
  basic composition of, 10-13
  compared with PRELIS, 131
  compared with SIMPLIS, 81-82, 132-134
  input files, 44-62
    examples of, 62-65, 94-100
  output files, 65-66, 100-119
  Windows version, see Windows
LISREL 8 icon, 346
Listwise deletion, 71, 75, 76-77
LK keyword, 98, 141
Λ matrix, see Factor-loading matrix
LX matrix, see Lambda-X (Λ) matrix
M
MA, see Matrix to be analyzed (MA)
MO, see Model specification (MO)
Manifest variables, see Observed variables
Masks, Windows, 348
Maslach Burnout Inventory (MBI)
  described, 135, 136-137, 238
  Form Ed, 137
  testing factorial validity of, 135-137
Matrices
  defined, 11
  full, 23, 35, 47, 52-53
  model specification for, 52-53
  Psi, 25, 36
  regression, 35
  representation of, 11-13
  variance-