The Structural Equation Models (SEM) platform enables you to fit a wide variety of models that can be used to test theories of relationships among variables. The variables in the models can be observed (manifest variables) or unobserved (latent variables). Structural equation modeling is popular in the social and behavioral sciences.
By default, the platform specifies a model with means and variances for all variables. The platform then provides a model-building interface that enables you to see multiple views of the model while it is being built. During model construction, it also displays model details that alert you to untenable models before you run them.
After you fit one or more models, you can compare the fitted models and two baseline models in the Model Comparison report. The baseline models are an unrestricted model and an independence model. The unrestricted model is a fully saturated model, which fits all means, variances, and covariances of the specified Model Variables without imposing any structure on the data. The independence model fits all means and variances of the specified Model Variables. All covariances among the specified Model Variables are fixed to zero, which leads to a highly restrictive model.
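As a rough illustration of why the independence model is so much more restrictive than the unrestricted model, the following sketch counts the free parameters of each baseline model for a given number of Model Variables. The function name `baseline_parameter_counts` is hypothetical and is not part of the platform; the counts follow directly from the descriptions above.

```python
def baseline_parameter_counts(p: int) -> dict:
    """Count free parameters of the two baseline models for p Model Variables."""
    means = p
    variances = p
    covariances = p * (p - 1) // 2  # one covariance per distinct pair of variables
    return {
        # Unrestricted (saturated): all means, variances, and covariances are free.
        "unrestricted": means + variances + covariances,
        # Independence: means and variances are free; covariances are fixed to zero.
        "independence": means + variances,
    }


# Illustrative case with four Model Variables.
counts = baseline_parameter_counts(4)
print(counts)
```

The gap between the two counts is the number of covariances that the independence model fixes to zero, which grows quadratically with the number of variables.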
The SEM platform estimates models using the full information maximum likelihood (FIML) method (Finkbeiner 1979). This method uses all of the available information in the data, even when a high proportion of observations have values that are missing at random.
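The idea behind full information maximum likelihood can be sketched as a casewise log-likelihood: each observation contributes a multivariate normal log-density computed from only the variables that are observed for that row. The following is a minimal sketch under that assumption, not the platform's implementation. The function name `fiml_loglik` and the small data array are made up for illustration.

```python
import numpy as np


def fiml_loglik(data, mu, sigma):
    """Sum casewise normal log-likelihoods using only each row's observed entries."""
    total = 0.0
    for row in data:
        obs = ~np.isnan(row)            # mask of observed variables for this case
        k = int(obs.sum())
        if k == 0:
            continue                    # a fully missing row contributes nothing
        resid = row[obs] - mu[obs]
        sub = sigma[np.ix_(obs, obs)]   # covariance block for the observed variables
        _, logdet = np.linalg.slogdet(sub)
        quad = resid @ np.linalg.solve(sub, resid)
        total += -0.5 * (k * np.log(2 * np.pi) + logdet + quad)
    return total


# Illustrative data: the second row is missing its second variable (NaN).
data = np.array([[0.0, 0.0], [0.0, np.nan]])
loglik = fiml_loglik(data, mu=np.zeros(2), sigma=np.eye(2))
print(loglik)
```

In an actual fit, `mu` and `sigma` would be the model-implied means and covariances, and an optimizer would search for the parameter values that maximize this sum. No rows are discarded because of partial missingness.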
For more information about structural equation modeling, see the CALIS Procedure chapter in SAS Institute Inc. (2020a), Bollen (1989), and Kline (2016).
Note: All models in the Structural Equation Models platform are estimated with a mean structure, which means that a Constant term is included. If you do not want to place a structure on the means of the observed variables, then the means should be freely estimated as in the default model specification.
This section describes some of the types of models that can be fit in the Structural Equation Models platform:
• Path Analysis enables you to test alternative explanatory models of the associations between observed variables. This technique is often used when only one variable is available per construct of interest in a study. Perhaps the simplest Path Analysis model is a standard regression model, in which X predicts Y. The SEM platform enables you to fit this simple regression model, but you can also specify more elaborate models. For example, you might have a variable Z that is presumed, based on theory or previous research, to be a mediator of the X ⇒ Y relationship. In other words, X predicts Z, which then predicts Y. Thus, the original X ⇒ Y relationship might exist only because Z is excluded from the original model. You can carry out path analysis in the SEM platform by performing the following steps:
1. Select all the observed variables in the launch window, click Model Variables, and click OK.
2. In the Model Specification report, select the predictors in the From List and the corresponding outcomes in the To List, and click the unidirectional arrow button.
Note: All exogenous variables (those that do not have any unidirectional arrows pointing at them) must be freely correlated in the model, unless a hypothesis of zero correlation is being tested. These covariances are specified with the bidirectional arrow button.
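To make the mediation example concrete, the following sketch computes the covariance matrix implied by a path model in which X predicts Z and Z predicts Y. It uses the standard path-analysis expression Σ = (I − B)⁻¹ Ψ (I − B)⁻ᵀ, where B holds the unidirectional paths and Ψ holds the variances of exogenous variables and residuals. The coefficient values and the function name `implied_covariance` are illustrative assumptions, not output from the platform.

```python
import numpy as np


def implied_covariance(B, Psi):
    """Model-implied covariance for a recursive path model: (I - B)^-1 Psi (I - B)^-T."""
    identity = np.eye(B.shape[0])
    total_effects = np.linalg.inv(identity - B)
    return total_effects @ Psi @ total_effects.T


# Variable order: X, Z, Y. B[i, j] is the path from variable j to variable i.
a, b, c = 0.5, 0.4, 0.0           # X -> Z, Z -> Y, and direct X -> Y (hypothetical values)
B = np.array([
    [0.0, 0.0, 0.0],              # X is exogenous
    [a,   0.0, 0.0],              # Z is predicted by X
    [c,   b,   0.0],              # Y is predicted by Z (and by X when c is nonzero)
])
Psi = np.diag([1.0, 1.0, 1.0])    # variance of X and residual variances of Z and Y

sigma = implied_covariance(B, Psi)
indirect = a * b                  # effect of X on Y transmitted through Z
print(sigma)
print(indirect)
```

With the direct path `c` set to zero, the implied covariance between X and Y equals the indirect effect `a * b`. This is the sense in which an X ⇒ Y association can exist purely through the mediator.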
• Confirmatory Factor Analysis (CFA) enables you to test alternative measurement models. CFA is often used in survey development and as an initial step prior to fitting structural regression models. You can fit confirmatory factor analysis models in the SEM platform by performing the following steps:
1. Select all the observed variables in the launch window, click Model Variables, and click OK.
2. Using the To List under Model Specification, select the variables that are presumed to load onto a latent variable.
3. Enter the name for the latent variable in the box below the To List, and click the add latent button.
4. Repeat this process until all the latent variables for the model have been specified.
Note that the SEM platform always includes a mean structure, so all of the observed variables are listed in the Means/Intercepts list as outcomes of the Constant term. Moreover, all latent variables are automatically identified by setting the loading of their first indicator to 1 (default) or their variance to 1 if the Standardized Latent Variables option was selected in the launch window. Finally, the traditional CFA model allows all latent variables to covary. You can specify these covariances by selecting all the latent variables in the From and To Lists and then clicking the bidirectional arrow button.
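The measurement model described above can be sketched with the standard CFA covariance expression Σ = Λ Φ Λᵀ + Θ, where Λ contains the loadings, Φ the latent variances and covariances, and Θ the unique (error) variances. The example below assumes two latent variables with two indicators each and uses the default identification, in which the loading of each latent variable's first indicator is fixed to 1. All numeric values are hypothetical.

```python
import numpy as np

# Loadings: rows are observed indicators, columns are latent variables.
# The first indicator of each latent variable has its loading fixed to 1.
Lambda = np.array([
    [1.0, 0.0],
    [0.8, 0.0],
    [0.0, 1.0],
    [0.0, 1.2],
])

# Latent variances on the diagonal; the off-diagonal entry is the latent
# covariance that a traditional CFA allows to be freely estimated.
Phi = np.array([
    [1.0, 0.3],
    [0.3, 2.0],
])

# Unique variances of the four indicators.
Theta = np.diag([0.5, 0.5, 0.5, 0.5])

# Model-implied covariance matrix of the observed indicators.
Sigma = Lambda @ Phi @ Lambda.T + Theta
print(Sigma)
```

Indicators of different latent variables covary only through the latent covariance in `Phi`. Setting that entry to zero would make the two blocks of indicators uncorrelated in the implied matrix.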
• Structural Regression (SR) models are also known as path analysis with latent variables. These models are often used after having identified an appropriate measurement model through confirmatory factor analysis (CFA). SR models enable you to test specific patterns of relationships between latent variables. In other words, while CFA does not impose any directionality in the effects between latent variables (all latent variables are allowed to freely covary), SR models do. In an example where management Leadership is hypothesized to lead to less team Conflict and more employee Satisfaction in the workplace, the Leadership latent variable can predict the Conflict and Satisfaction latent variables. You can specify the measurement model and then these directional effects (regressions) by performing the following steps:
1. Select all the observed variables in the launch window, click Model Variables, and click OK.
2. Using the To List under Model Specification, select the variables that are presumed to load onto a latent variable.
3. Enter the name for the latent variable in the box below the To List, and click the add latent button.
4. Repeat this process until all the latent variables for the model have been specified.
5. In the Model Specification report, select the predictors in the From List and the corresponding outcomes in the To List, and click the unidirectional arrow button.
• Latent Growth Curve (LGC) models enable you to fit and test alternative latent trajectories to repeated measures data. These models are very similar to random effects models in the mixed models framework. Often, you want to compare a no-growth model with a linear model. In a no-growth model, individuals can vary in their starting point but have flat trajectories. In a linear model, individuals can vary in both their starting point and linear slope over time. If enough data are available, you can also compare these models with a quadratic model where individuals can vary in their starting point and their linear and quadratic rates of change over time. You can fit LGC models using the SEM platform model specification, but the Model Shortcuts option streamlines the process. To fit and compare LGC models using the shortcuts, perform the following steps:
1. Select all the observed variables (repeated measures) in the launch window, click Model Variables, and click OK.
Note: For the Latent Growth model shortcuts to specify the model correctly, the observed variables must be listed in ascending time order and must have equal time intervals.
2. Using the Model Shortcuts option, select the Longitudinal Analysis > Intercept-Only Growth Curve model, and then click Run.
3. Using the Model Shortcuts option, select the Longitudinal Analysis > Linear Latent Growth Curve model, and then click Run.
4. Using the Model Shortcuts option, select the Longitudinal Analysis > Quadratic Latent Growth Curve model, and then click Run.
The Model Comparison table shows the fit indices for the alternative models, which you can use to select the best model.
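The three growth models differ in the fixed loadings that connect the repeated measures to the latent growth factors. The sketch below builds those loading matrices under a common coding assumption: time is scored 0, 1, 2, and so on, which requires equally spaced measurements in ascending order. The exact time coding that the shortcuts use is not stated here, so treat the scores and the function name `growth_loadings` as illustrative.

```python
import numpy as np


def growth_loadings(n_times: int, degree: int) -> np.ndarray:
    """Fixed loadings for a polynomial growth model with time coded 0, 1, ..., n_times - 1.

    degree 0: intercept only (no growth)
    degree 1: intercept and linear slope
    degree 2: intercept, linear slope, and quadratic slope
    """
    t = np.arange(n_times, dtype=float)
    # Column d holds t raised to the power d; column 0 is all ones (the intercept).
    return np.column_stack([t ** d for d in range(degree + 1)])


no_growth = growth_loadings(4, 0)
linear = growth_loadings(4, 1)
quadratic = growth_loadings(4, 2)
print(quadratic)

# Expected trajectory for hypothetical latent means: intercept 10, slope 2.
expected_means = linear @ np.array([10.0, 2.0])
print(expected_means)
```

Because each model's loading matrix is contained in the next, the no-growth, linear, and quadratic models form a natural sequence for comparison.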
• Conditional Latent Growth Curve models can be used after identifying an ideal growth trajectory following the steps above. At this point, predictors of the intercept and change factors can be added to the model. These predictors might prove to be important factors for determining initial scores on a growth process and ensuing changes. To fit a conditional LGC, select all of the observed variables (repeated measures), including the hypothesized predictors of the latent variables, in the launch window. Make sure that the predictors are the last variables in the Model Variables list to facilitate the following steps:
1. Use the Model Shortcuts option to select the appropriate growth trajectory. This option specifies all variables in the LGC model, including the predictors. Thus, you need to exclude the predictors from the growth process and correctly specify them as predictors.
2. Find the predictors in the Loadings list, select all the effects that involve them, and click Remove.
3. Select the predictors in the From List and the Intercept or Slope in the To List.
4. Click the unidirectional arrow to specify the conditional LGC.
Note: If you have more than one predictor, their covariances must be specified by selecting the predictors in the From and To Lists and clicking the bidirectional arrow button.
• Multiple Group Analysis models enable you to specify a grouping variable for any model in the SEM framework. The model is then estimated across groups, which enables you to make inferences about different populations. You can specify a multiple group analysis model using the following steps:
1. Select the observed variables you want to model in the launch window and click Model Variables.
2. Select a categorical grouping variable (often a variable with few levels), click Groups, and click OK.
3. Use the Model Specification report to specify your model of choice. To add regression paths, select predictors in the From List, select the corresponding outcomes in the To List, and then click the unidirectional arrow button. To add covariance paths, follow the same steps but click the bidirectional arrow button instead. To add latent variables, select their indicators in the To List and click the add latent button under the To List.
4. Use the Union tab path diagram to select edges and click Set Equal to apply equality constraints across groups. Group-specific constraints or specification changes can be applied using the group-specific tabs. Paths in the model are freely estimated across groups by default.