# Structural Equation Modeling {#sem}
> "All models are wrong, but some are useful."
>
> --- George Box [-@Box1979, p. 202]
## Overview of SEM {#overview-sem}
Structural equation modeling is an advanced modeling approach that allows estimating latent variables to account for [measurement error](#measurementError) and to get purer estimates of constructs.\index{structural equation modeling}\index{measurement error}\index{latent variable}
## Getting Started {#gettingStarted-sem}
### Load Libraries {#loadLibraries-sem}
```{r}
library("petersenlab") #to install: install.packages("remotes"); remotes::install_github("DevPsyLab/petersenlab")
library("lavaan")
library("semTools")
library("semPlot")
library("simsem")
library("snow")
library("mice")
library("quantreg")
library("nonnest2")
library("MOTE")
library("tidyverse")
library("here")
library("tinytex")
```
### Prepare Data {#prepareData-sem}
#### Simulate Data {#simulateData-sem}
For reproducibility, I set the seed below.\index{simulate data}
Using the same seed will yield the same answer every time.
There is nothing special about this particular seed.
The [`petersenlab`](https://github.com/DevPsyLab/petersenlab) package [@R-petersenlab] includes a [`complement()` function](https://stats.stackexchange.com/a/313138/20338) (archived at https://perma.cc/S26F-QSW3) that simulates data with a specified correlation in relation to an existing variable.\index{petersenlab package}
`PoliticalDemocracy` refers to the Industrialization and Political Democracy data set from the `lavaan` package [@R-lavaan], and it contains measures of political democracy and industrialization in developing countries.
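One common way to generate a variable with a specified sample correlation with an existing variable is to residualize random noise on the existing variable and then mix the two in the right proportions.
The sketch below illustrates that general idea under the assumption that the existing variable has no missing values; `simulateCorrelated()` is a hypothetical helper written for illustration and is not necessarily how `complement()` is implemented in the `petersenlab` package.

```{r}
# Hypothetical helper (illustration only; not the petersenlab implementation)
simulateCorrelated <- function(existing, targetCorrelation) {
  noise <- rnorm(length(existing))
  # Keep only the part of the noise that is uncorrelated with the existing variable
  noiseResidual <- residuals(lm(noise ~ existing))
  # Mix the existing variable and the residual noise so that the
  # sample correlation with the existing variable equals the target
  targetCorrelation * sd(noiseResidual) * existing +
    sqrt(1 - targetCorrelation^2) * sd(existing) * noiseResidual
}

set.seed(1) # arbitrary seed, used only for this sketch
existingVariable <- rnorm(75, mean = 5, sd = 2)
simulatedVariable <- simulateCorrelated(
  existingVariable,
  targetCorrelation = .4)

# Sample correlation between the existing and simulated variables
cor(existingVariable, simulatedVariable)
```

Because the residualized noise is exactly uncorrelated with the existing variable in the sample, the mixing weights fix the sample correlation at the target value rather than only approximating it.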
```{r}
sampleSize <- 300
set.seed(52242)
v1 <- complement(PoliticalDemocracy$y1, .4)
v2 <- complement(PoliticalDemocracy$y1, .4)
v3 <- complement(PoliticalDemocracy$y1, .4)
v4 <- complement(PoliticalDemocracy$y1, .4)
PoliticalDemocracy$v1 <- v1
PoliticalDemocracy$v2 <- v2
PoliticalDemocracy$v3 <- v3
PoliticalDemocracy$v4 <- v4
measure1 <- rnorm(n = sampleSize, mean = 50, sd = 10)
measure2 <- measure1 + rnorm(n = sampleSize, mean = 0, sd = 15)
measure3 <- measure1 + measure2 + rnorm(n = sampleSize, mean = 0, sd = 15)
```
#### Add Missing Data {#addMissingData-sem}
Adding missing data to dataframes makes examples more representative of real-life data and helps you get in the habit of writing code that accounts for missing data.
```{r}
measure1[c(5,10)] <- NA
measure2[c(10,15)] <- NA
measure3[c(10)] <- NA
PoliticalDemocracy <-
as.data.frame(lapply(
PoliticalDemocracy,
function(cc) cc[ sample(
c(TRUE, NA),
prob = c(0.9, 0.1),
size = length(cc),
replace = TRUE)]))
```
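As a small illustration of why this habit matters, many base R summary functions return `NA` when any value is missing unless told otherwise.
The vector `exampleScores` below is made up for this sketch.

```{r}
# Hypothetical scores with two missing values (made up for illustration)
exampleScores <- c(52, NA, 47, 61, NA, 55)

# Number of missing values
sum(is.na(exampleScores))

# Missing values propagate by default, so this returns NA
mean(exampleScores)

# Mean of the observed values only
mean(exampleScores, na.rm = TRUE)
```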
#### Combine Data into Dataframe {#combineData-sem}
```{r}
mydataSEM <- data.frame(measure1, measure2, measure3)
```
## Types of Models {#modelTypes-sem}
### Path Analysis Model {#pathAnalysis-sem}
To understand structural equation modeling (SEM), it is helpful to first understand *path analysis*.\index{path analysis}
Path analysis is similar to multiple regression.\index{path analysis}\index{multiple regression}
Path analysis allows examining the associations of multiple predictor variables (or independent variables) with an outcome variable (or dependent variable).\index{path analysis}
Unlike multiple regression, however, path analysis also allows inclusion of multiple *dependent* variables in the same model.\index{path analysis}\index{multiple regression}
Unlike SEM, path analysis uses only manifest (observed) variables, not latent variables (described next).\index{path analysis}\index{structural equation modeling}\index{latent variable}
SEM is path analysis, but with latent (unobserved) variables.\index{structural equation modeling}\index{latent variable}
That is, a SEM model is a model that includes latent variables in addition to observed variables, where one attempts to model (i.e., explain) the *structure* of associations among variables using a series of equations (hence structural equation modeling).\index{structural equation modeling}\index{latent variable}
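To make the contrast with multiple regression concrete, here is a minimal sketch of a path analysis model with two dependent variables, written in `lavaan` syntax.
It uses the built-in `mtcars` data purely for convenience; the choice of variables, the rescaled `hp100` variable, and the object names are assumptions made for illustration, not a substantive model.

```{r}
library("lavaan")

# Illustration data: rescale horsepower so the variables have similar variances
pathAnalysisData <- transform(mtcars, hp100 = hp / 100)

# Two dependent variables (mpg, qsec) are predicted in the same model;
# all variables are manifest (observed), so there are no latent variables
pathAnalysisSketch_syntax <- '
 mpg ~ wt + hp100
 qsec ~ wt + hp100
'

pathAnalysisSketchFit <- sem(
  pathAnalysisSketch_syntax,
  data = pathAnalysisData)

parameterEstimates(pathAnalysisSketchFit)
```

Each `~` line is one regression equation; estimating both equations in one model is what distinguishes path analysis from a single multiple regression.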
### Components of a Structural Equation Model {#semModelComponents}
#### Measurement Model {#measurementModel-sem}
The measurement model is a crucial sub-component of any SEM model.\index{structural equation modeling!measurement model}
A SEM model consists of two components: a measurement model and a structural model.\index{structural equation modeling!measurement model}
The *measurement model* is a [confirmatory factor analysis](#cfa-sem) (CFA) model that identifies how many latent factors are estimated, and which items load onto which latent factor.\index{structural equation modeling!measurement model}\index{factor analysis!confirmatory}\index{latent variable}
The measurement model can also specify correlated residuals.\index{structural equation modeling!residual!correlated}\index{structural equation modeling!measurement model}
Basically, the measurement model specifies your best understanding of the structure of the latent construct(s) given how they were assessed.\index{structural equation modeling!measurement model}\index{construct}\index{latent variable}
Before fitting the structural component of a SEM, it is important to have a well-fitting measurement model for each construct in the model.\index{structural equation modeling!measurement model}\index{construct}
In Section \@ref(measurementModel-sem), I present an example of a measurement model.\index{structural equation modeling!measurement model}
#### Structural Model {#structuralModel-sem}
The *structural component* of a SEM model includes the regression paths that specify the hypothesized causal relations among the latent variables.\index{structural equation modeling!structural model}\index{latent variable}
### Confirmatory Factor Analysis Model {#cfa-sem}
[Confirmatory factor analysis](#cfa) (CFA) is a subset of SEM.
CFA includes the [measurement model](#measurementModel-sem) but not the [structural component](#structuralModel-sem) of the model.\index{factor analysis!confirmatory}\index{structural equation modeling!measurement model}\index{structural equation modeling!structural model}
In Section \@ref(cfaExample-sem), I present an example of a [CFA](#cfa) model.\index{factor analysis!confirmatory}
I discuss [CFA](#cfa) models in greater depth in Chapter \@ref(factor-analysis-PCA).\index{factor analysis!confirmatory}
### Structural Equation Model {#semModel}
SEM is [CFA](#cfa), but it adds regression paths that specify hypothesized causal relations between the latent variables, which is called the [structural component](#structuralModel-sem) of the model.\index{structural equation modeling}\index{structural equation modeling!structural model}\index{factor analysis!confirmatory}\index{latent variable}
The [structural model](#structuralModel-sem) includes the hypothesized causal relations between latent variables.\index{structural equation modeling!structural model}\index{latent variable}
A SEM model includes both the [measurement model](#measurementModel-sem) and the [structural model](#structuralModel-sem) [see Figure \@ref(fig:measurementModelStructuralModel), @Civelek2018].\index{structural equation modeling}\index{structural equation modeling!measurement model}\index{structural equation modeling!structural model}
SEM fits a model to observed data, or the variance-covariance matrix, and evaluates the degree of model misfit.\index{structural equation modeling}
That is, fit indices evaluate how likely it is that a given model gave rise to the observed data.\index{structural equation modeling!fit index}
In Section \@ref(semModelExample-sem), I present an example of a SEM model.\index{structural equation modeling}
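The notion of misfit can be sketched numerically.
For a one-factor model, the model-implied covariance matrix is the outer product of the factor loadings, scaled by the latent variance, plus a diagonal matrix of residual variances.
The loadings and the "observed" matrix below are made-up values (fixed by hand, not estimated), and the object names are hypothetical.

```{r}
# Hypothetical standardized loadings for four indicators of one latent factor
sketchLoadings <- c(.8, .7, .6, .5)
sketchLatentVariance <- 1
sketchResidualVariances <- 1 - sketchLoadings^2

# Model-implied covariance matrix:
# latent variance * (loadings %*% t(loadings)) + diagonal residual variances
sketchImpliedCov <- sketchLatentVariance * tcrossprod(sketchLoadings) +
  diag(sketchResidualVariances)

# Made-up "observed" covariance (here, correlation) matrix
sketchObservedCov <- matrix(
  c(1.00, 0.60, 0.45, 0.40,
    0.60, 1.00, 0.42, 0.30,
    0.45, 0.42, 1.00, 0.35,
    0.40, 0.30, 0.35, 1.00),
  nrow = 4,
  byrow = TRUE)

# Residual matrix: larger absolute values indicate greater local misfit
round(sketchObservedCov - sketchImpliedCov, 2)
```

In an actual analysis, the parameters are estimated to make this discrepancy as small as possible, and fit indices summarize what remains.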
(ref:measurementModelStructuralModelCaption) Demarcation Between Measurement Model and Structural Model. (Figure adapted from @Civelek2018, Figure 1, p. 7. Civelek, M. E. (2018). *Essentials of structural equation modeling*. Zea E-Books. [https://doi.org/10.13014/K2SJ1HR5](https://doi.org/10.13014/K2SJ1HR5))
```{r measurementModelStructuralModel, out.width = "100%", fig.align = "center", fig.cap = "(ref:measurementModelStructuralModelCaption)", fig.scap = "Demarcation Between Measurement Model and Structural Model.", echo = FALSE}
knitr::include_graphics("./Images/measurementModelStructuralModel.png")
```
SEM is flexible in allowing you to specify [measurement error](#measurementError) and correlated errors.\index{structural equation modeling}\index{measurement error}\index{structural equation modeling!residual!correlated}
Thus, you do not need the same assumptions as in [classical test theory](#ctt), which assumes that [errors](#measurementError) are [random](#randomError) and uncorrelated.\index{classical test theory}\index{measurement error}\index{measurement error!random error}
But the flexibility of SEM also poses challenges because you must explicitly decide what to include—and not include—in your model.\index{structural equation modeling}
This flexibility can be both a blessing and a curse.\index{structural equation modeling}
If the model fit is unacceptable, you can try fitting a different model to see which fits better.\index{structural equation modeling}\index{structural equation modeling!fit index}
Nevertheless, it is important to use theory as a guide when specifying and comparing competing models, rather than relying solely on model fit comparison.\index{structural equation modeling}\index{structural equation modeling!fit index}\index{theory}\index{empiricism}
For example, the model you fit should depend on how you conceptualize each construct: as [reflective](#reflectiveConstruct) or [formative](#formativeConstruct).\index{construct!reflective}\index{construct!formative}
## Estimating Latent Factors {#formativeReflective-sem}
### Model Identification {#modelIdentification-sem}
#### Types of Model Identification {#modelIdentificationTypes-sem}
There are important practical issues to consider with both [reflective](#reflectiveConstruct) and [formative](#formativeConstruct) models.\index{construct!reflective}\index{construct!formative}
An important practical issue is model identification: adding enough constraints so that there is only one best solution.\index{structural equation modeling!model identification}
The model is identified when each of the estimated parameters has a unique solution.\index{structural equation modeling!model identification}
For ensuring the model is identifiable, see the criteria for identification of the measurement and structural model [here](https://davidakenny.net/cm/identify_formal.htm) (archived at https://perma.cc/5C9E-LBWM).\index{factor analysis}\index{factor analysis!identifiability}
The degrees of freedom in a SEM model equal the number of known values minus the number of estimated parameters.\index{structural equation modeling!degrees of freedom}
The number of known values in a SEM model is the number of variances and covariances in the variance-covariance matrix of the manifest (observed) variables in addition to the number of means (i.e., the number of manifest variables), which can be calculated as: $\frac{m(m + 1)}{2} + m$, where $m = \text{the number of manifest variables}$.\index{structural equation modeling!degrees of freedom}
You can never estimate more parameters than the number of known values.\index{structural equation modeling!model identification}\index{structural equation modeling!degrees of freedom}
A model with zero degrees of freedom is considered "saturated"—it will have perfect fit because the model estimates as many parameters as there are known values.\index{structural equation modeling!model identification}\index{structural equation modeling!degrees of freedom}
All things equal (i.e., in terms of model fit with the same number of manifest variables), a model with more degrees of freedom is preferred for its parsimony, because fewer parameters are estimated.\index{structural equation modeling!degrees of freedom}\index{parsimony}
Based on the number of known values compared to the number of estimated parameters, a model can be considered either just identified, under-identified, or over-identified.\index{structural equation modeling!model identification}\index{structural equation modeling!degrees of freedom}
A *just identified model* is a model in which the number of known values is equal to the number of parameters to be estimated (degrees of freedom = 0).\index{structural equation modeling!model identification}\index{structural equation modeling!degrees of freedom}
An *under-identified model* is a model in which the number of known values is less than the number of parameters to be estimated (degrees of freedom < 0).\index{structural equation modeling!model identification}\index{structural equation modeling!degrees of freedom}
An *over-identified model* is a model in which the number of known values is greater than the number of parameters to be estimated (degrees of freedom > 0).\index{structural equation modeling!model identification}\index{structural equation modeling!degrees of freedom}
As an example, there are 14 known values for a model with 4 manifest variables ($\frac{4(4 + 1)}{2} + 4 = 14$): 4 variances, 6 covariances, and 4 means.\index{structural equation modeling!model identification}\index{structural equation modeling!degrees of freedom}
Here is the variance-covariance matrix:
```{r}
vcovMatrix4measures <- cov(
PoliticalDemocracy[,c("y1","y2","y3","y4")],
use = "pairwise.complete.obs")
vcovMatrix4measures[upper.tri(vcovMatrix4measures)] <- NA
vcovMatrix4measures
```
Here are the variances:
```{r}
variances4measures <- diag(vcovMatrix4measures)
variances4measures
```
Here are the covariances:
```{r}
covariances4measures <- vcovMatrix4measures[lower.tri(vcovMatrix4measures)]
covariances4measures
```
Here are the means:
```{r}
means4Measures <- apply(
PoliticalDemocracy[,c("y1","y2","y3","y4")],
2, mean, na.rm = TRUE)
means4Measures
```
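The counting rule and the three identification categories described in this section can be sketched as two small helpers.
The function names are hypothetical, and the parameter counts passed in at the end (12, 9, and 19) are illustrative inputs.
Note that this classification is based only on counting; having nonnegative degrees of freedom is a necessary, but not a sufficient, condition for a model to be identified.

```{r}
# Hypothetical helper: number of known values for a model with a mean structure
countKnownValues <- function(numManifest) {
  numVariancesAndCovariances <- numManifest * (numManifest + 1) / 2
  numMeans <- numManifest
  numVariancesAndCovariances + numMeans
}

# Hypothetical helper: classify a model by its degrees of freedom
classifyIdentification <- function(numKnown, numEstimated) {
  degreesOfFreedom <- numKnown - numEstimated
  if (degreesOfFreedom > 0) {
    status <- "over-identified"
  } else if (degreesOfFreedom == 0) {
    status <- "just identified"
  } else {
    status <- "under-identified"
  }
  list(df = degreesOfFreedom, status = status)
}

# Known values with 4 manifest variables: 4 variances + 6 covariances + 4 means
countKnownValues(4)

# Illustrative parameter counts
classifyIdentification(countKnownValues(4), numEstimated = 12)
classifyIdentification(countKnownValues(3), numEstimated = 9)
classifyIdentification(countKnownValues(4), numEstimated = 19)
```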
#### Approaches to Model Identification {#modelIdentificationApproaches-sem}
The three most widely used approaches to identifying latent factors are:\index{structural equation modeling!model identification}\index{latent variable}
1. [Marker variable](#markerVariable-sem)\index{structural equation modeling!model identification}
1. [Effects coding](#effectsCoding-sem)\index{structural equation modeling!model identification}
1. [Standardized latent factor](#standardizedLatent-sem)\index{structural equation modeling!model identification}
##### Marker Variable Method {#markerVariable-sem}
In the marker variable method, one of the indicators (i.e., manifest variables) is set to have a loading of 1.\index{structural equation modeling!model identification}
Here are examples of using the marker variable method for identification of a latent variable:\index{structural equation modeling!model identification}
```{r}
markerVariable_syntax <- '
#Factor loadings
latentFactor =~ y1 + y2 + y3 + y4
'
markerVariable_fullSyntax <- '
#Factor loadings
latentFactor =~ 1*y1 + y2 + y3 + y4
#Latent variance
latentFactor ~~ latentFactor
#Estimate residual variances of manifest variables
y1 ~~ y1
y2 ~~ y2
y3 ~~ y3
y4 ~~ y4
#Estimate intercepts of manifest variables
y1 ~ 1
y2 ~ 1
y3 ~ 1
y4 ~ 1
'
markerVariableModelFit <- sem(
markerVariable_syntax,
data = PoliticalDemocracy,
missing = "ML",
estimator = "MLR")
markerVariableModelFit_full <- lavaan(
markerVariable_fullSyntax,
data = PoliticalDemocracy,
missing = "ML",
estimator = "MLR")
```
```{r markerVariable, out.width = "100%", fig.align = "center", fig.cap = "Identifying a Latent Variable Using the Marker Variable Approach."}
semPaths(
markerVariableModelFit,
what = "est",
layout = "tree2",
edge.label.cex = 0.8)
```
##### Effects Coding Method {#effectsCoding-sem}
In the effects coding method, the average of the factor loadings is set to be 1.\index{structural equation modeling!model identification}
The effects coding method is useful if you are interested in the means or variances of the latent factor, because the metric of the latent factor is on the metric of the indicators.\index{structural equation modeling!model identification}
Here are examples of using the effects coding method for identification of a latent variable:\index{structural equation modeling!model identification}
```{r}
effectsCoding_abbreviatedSyntax <- '
#Factor loadings
latentFactor =~ y1 + y2 + y3 + y4
'
effectsCoding_syntax <- '
#Factor loadings
latentFactor =~ NA*y1 + label1*y1 + label2*y2 + label3*y3 + label4*y4
#Constrain factor loadings
label1 == 4 - label2 - label3 - label4 # 4 = number of indicators
'
effectsCoding_fullSyntax <- '
#Factor loadings
latentFactor =~ label1*y1 + label2*y2 + label3*y3 + label4*y4
#Constrain factor loadings
label1 == 4 - label2 - label3 - label4 # 4 = number of indicators
#Latent variance
latentFactor ~~ latentFactor
#Estimate residual variances of manifest variables
y1 ~~ y1
y2 ~~ y2
y3 ~~ y3
y4 ~~ y4
#Estimate intercepts of manifest variables
y1 ~ 1
y2 ~ 1
y3 ~ 1
y4 ~ 1
'
effectsCodingModelFit_abbreviated <- sem(
effectsCoding_abbreviatedSyntax,
data = PoliticalDemocracy,
effect.coding = "loadings",
missing = "ML",
estimator = "MLR")
effectsCodingModelFit <- sem(
effectsCoding_syntax,
data = PoliticalDemocracy,
missing = "ML",
estimator = "MLR")
effectsCodingModelFit_full <- lavaan(
effectsCoding_fullSyntax,
data = PoliticalDemocracy,
missing = "ML",
estimator = "MLR")
```
```{r effectsCoding, out.width = "100%", fig.align = "center", fig.cap = "Identifying a Latent Variable Using the Effects Coding Approach."}
semPaths(
effectsCodingModelFit,
what = "est",
layout = "tree2",
edge.label.cex = 0.8)
```
##### Standardized Latent Factor Method {#standardizedLatent-sem}
In the standardized latent factor method, the latent factor is set to have a mean of 0 and a standard deviation of 1.\index{structural equation modeling!model identification}
The standardized latent factor method is a useful approach if you are not interested in the means or variances of the latent factors and want to freely estimate the factor loadings.\index{structural equation modeling!model identification}
Here are examples of using the standardized latent factor method for identification of a latent variable:\index{structural equation modeling!model identification}
```{r}
standardizedLatent_abbreviatedsyntax <- '
#Factor loadings
latentFactor =~ y1 + y2 + y3 + y4
'
standardizedLatent_syntax <- '
#Factor loadings
latentFactor =~ NA*y1 + y2 + y3 + y4
#Latent mean
latentFactor ~ 0
#Latent variance
latentFactor ~~ 1*latentFactor
'
standardizedLatent_fullSyntax <- '
#Factor loadings
latentFactor =~ NA*y1 + y2 + y3 + y4
#Latent mean
latentFactor ~ 0
#Latent variance
latentFactor ~~ 1*latentFactor
#Estimate residual variances of manifest variables
y1 ~~ y1
y2 ~~ y2
y3 ~~ y3
y4 ~~ y4
#Estimate intercepts of manifest variables
y1 ~ 1
y2 ~ 1
y3 ~ 1
y4 ~ 1
'
standardizedLatentFit_abbreviated <- sem(
standardizedLatent_abbreviatedsyntax,
data = PoliticalDemocracy,
std.lv = TRUE,
missing = "ML",
estimator = "MLR")
standardizedLatentFit <- sem(
standardizedLatent_syntax,
data = PoliticalDemocracy,
missing = "ML",
estimator = "MLR")
standardizedLatentFit_full <- lavaan(
standardizedLatent_fullSyntax,
data = PoliticalDemocracy,
missing = "ML",
estimator = "MLR")
```
```{r standaredizedLatent, out.width = "100%", fig.align = "center", fig.cap = "Identifying a Latent Variable Using the Standardized Latent Factor Approach."}
semPaths(
standardizedLatentFit,
what = "est",
layout = "tree2",
edge.label.cex = 0.8)
```
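The three identification approaches differ only in how they set the scale of the latent factor; for a single-factor model, they imply the same covariances among the indicators.
The sketch below illustrates this with made-up loadings (not estimates from the `PoliticalDemocracy` data); all object names are hypothetical.

```{r}
# Made-up loadings under the standardized latent factor method (latent variance = 1)
loadingsStandardized <- c(y1 = 2.0, y2 = 3.0, y3 = 2.5, y4 = 2.5)
latentVarianceStandardized <- 1

# Marker variable method: rescale so that the loading of y1 equals 1
loadingsMarker <- loadingsStandardized / loadingsStandardized[["y1"]]
latentVarianceMarker <- loadingsStandardized[["y1"]]^2

# Effects coding method: rescale so that the loadings average 1
loadingsEffects <- loadingsStandardized / mean(loadingsStandardized)
latentVarianceEffects <- mean(loadingsStandardized)^2

# Covariance among the indicators that is implied by the latent factor
impliedCommonCovariance <- function(loadings, latentVariance) {
  latentVariance * tcrossprod(loadings)
}

loadingsMarker
mean(loadingsEffects)

# The implied covariances are the same under all three scalings
all.equal(
  impliedCommonCovariance(loadingsStandardized, latentVarianceStandardized),
  impliedCommonCovariance(loadingsMarker, latentVarianceMarker))
all.equal(
  impliedCommonCovariance(loadingsStandardized, latentVarianceStandardized),
  impliedCommonCovariance(loadingsEffects, latentVarianceEffects))
```

Rescaling the loadings by a constant and the latent variance by the square of that constant leaves the implied covariances unchanged, which is why the choice among the three methods can be based on which latent metric is most interpretable for the question at hand.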
### Types of Latent Factors {#latentFactorTypes-sem}
#### Reflective Latent Factors {#reflectiveFactors-sem}
For a [reflective model](#reflectiveConstruct) with 4 indicators, we would need to estimate 12 parameters: a factor loading, error term, and intercept for each of the 4 indicators.\index{construct!reflective}\index{structural equation modeling!model identification}\index{structural equation modeling!degrees of freedom}
Here are the parameters estimated:
```{r}
reflectiveModel_syntax <- '
#Reflective model factor loadings
reflective =~ y1 + y2 + y3 + y4
'
reflectiveModelFit <- sem(
reflectiveModel_syntax,
data = PoliticalDemocracy,
missing = "ML",
estimator = "MLR",
std.lv = TRUE)
reflectiveModelParameters <- parameterEstimates(
reflectiveModelFit)[!is.na(parameterEstimates(reflectiveModelFit)$z),]
row.names(reflectiveModelParameters) <- NULL
reflectiveModelParameters
```
Here are the degrees of freedom:\index{structural equation modeling!degrees of freedom}
```{r}
fitMeasures(reflectiveModelFit, "df")
```
Here is a model diagram:
```{r reflectiveModelFigure, out.width = "100%", fig.align = "center", fig.cap = "Example of a Reflective Model."}
semPaths(
reflectiveModelFit,
what = "Std.all",
layout = "tree2",
edge.label.cex = 0.8)
```
Thus, for a [reflective model](#reflectiveConstruct), we only have to estimate a small number of parameters to specify what is happening in our model, so the model is parsimonious.\index{construct!reflective}\index{structural equation modeling!degrees of freedom}
With 4 indicators, the number of known values (14) is greater than the number of parameters (12).\index{construct!reflective}\index{structural equation modeling!model identification}\index{structural equation modeling!degrees of freedom}
We have 2 degrees of freedom ($14 - 12 = 2$).\index{construct!reflective}\index{structural equation modeling!model identification}\index{structural equation modeling!degrees of freedom}
Because the degrees of freedom are greater than 0, the model is over-identified and thus readily identifiable.\index{construct!reflective}\index{structural equation modeling!model identification}\index{structural equation modeling!degrees of freedom}
A [reflective model](#reflectiveConstruct) with 3 indicators would have 9 known values ($\frac{3(3 + 1)}{2} + 3 = 9$), 9 parameters (3 factor loadings, 3 error terms, 3 intercepts), and 0 degrees of freedom, and it would be identifiable because it would be just-identified.\index{construct!reflective}\index{structural equation modeling!model identification}\index{structural equation modeling!degrees of freedom}
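The pattern in these two examples can be written generally.
Assuming the same counting as above (one factor loading, one error term, and one intercept per indicator, with the scale of the latent factor fixed), the degrees of freedom for a single-factor reflective model with $m$ indicators are:

$$
df = \left[\frac{m(m + 1)}{2} + m\right] - 3m = \frac{m(m - 3)}{2}
$$

Thus, $m = 4$ gives $df = 2$, $m = 3$ gives $df = 0$, and $m = 2$ gives $df = -1$, which is why a single factor with only two indicators cannot be identified without additional constraints.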
#### Formative Latent Factors {#formativeFactors-sem}
For a [formative model](#formativeConstruct), however, we must specify more parameters: a factor loading, intercept, and variance for each of the 4 indicators, all 6 permissible correlations among the indicators, and 1 error term for the latent variable, for a total of 19 parameters.\index{construct!formative}\index{structural equation modeling!model identification}\index{structural equation modeling!degrees of freedom}
Here are the parameters estimated:
```{r, error = TRUE}
formativeModel_syntax <- '
#Formative model factor loadings
formative <~ v1 + v2 + v3 + v4
formative ~~ formative
'
formativeModelFit <- sem(
formativeModel_syntax,
data = PoliticalDemocracy,
missing = "ML",
estimator = "MLR")
formativeModelParameters <- parameterEstimates(formativeModelFit)
formativeModelParameters
```
Here are the degrees of freedom:\index{structural equation modeling!degrees of freedom}
```{r}
PT <- lavaanify(
formativeModel_syntax,
fixed.x = TRUE, # sem() sets fixed.x = TRUE by default
meanstructure = TRUE # estimator = "MLR" and missing = "ML" both set meanstructure = TRUE
)
lav_partable_df(PT)
formativeModelFit
```
Here is a model diagram:
```{r formativeModelUnderidentifiedFigure, out.width = "100%", fig.align = "center", fig.cap = "Example of an Under-Identified Formative Model."}
semPaths(
formativeModelFit,
what = "Std.all",
layout = "tree2",
edge.label.cex = 0.8)
```
For a [formative model](#formativeConstruct) with 4 measures, the number of known values (14) is less than the number of parameters (19).\index{construct!formative}\index{structural equation modeling!model identification}\index{structural equation modeling!degrees of freedom}
The number of degrees of freedom is negative ($14 - 19 = -5$), so the model cannot be identified.\index{construct!formative}\index{structural equation modeling!model identification}\index{structural equation modeling!degrees of freedom}
That is, a [formative model](#formativeConstruct) requires estimating more parameters than there are known values, so the model is under-identified.\index{construct!formative}\index{structural equation modeling!model identification}\index{structural equation modeling!degrees of freedom}
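Following the same counting used above for the formative model (a loading, intercept, and variance for each indicator, all correlations among the indicators, and one error term for the latent variable), the shortfall does not go away when indicators are added.
The helper `formativeModelDf()` below is hypothetical and simply applies that counting rule.

```{r}
# Hypothetical helper that applies the counting rule described in the text
formativeModelDf <- function(numIndicators) {
  # Known values: variances, covariances, and means of the indicators
  numKnown <- numIndicators * (numIndicators + 1) / 2 + numIndicators
  # Estimated parameters: a loading, intercept, and variance per indicator,
  # all correlations among the indicators, and 1 error term for the latent variable
  numEstimated <- 3 * numIndicators +
    numIndicators * (numIndicators - 1) / 2 +
    1
  numKnown - numEstimated
}

# Degrees of freedom with 4 indicators
formativeModelDf(4)

# Degrees of freedom with 2 to 6 indicators
sapply(2:6, formativeModelDf)
```

Under this counting rule, the degrees of freedom equal $-(m + 1)$ for $m$ indicators, so a formative factor specified on its own remains under-identified no matter how many indicators it has.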
Therefore, to estimate a formative model with 4 indicators, we must add assumptions and other variables that are consequences of the [formative construct](#formativeConstruct).\index{construct!formative}\index{structural equation modeling!model identification}\index{structural equation modeling!degrees of freedom}
Options for identifying a [formative construct](#formativeConstruct) are described by @Treiblmaier2011.\index{construct!formative}\index{structural equation modeling!model identification}\index{structural equation modeling!degrees of freedom}
See below for an example formative model that is identified because of additional assumptions.\index{construct!formative}\index{structural equation modeling!model identification}\index{structural equation modeling!degrees of freedom}
```{r formativeModelIdentifiedFigure, out.width = "100%", fig.align = "center", fig.cap = "Example of an Identified Formative Model."}
formativeModel2_syntax <- '
#Formative model factor loadings
formative <~ 1*v1 + v2 + v3 + v4
reflective =~ y1 + y2 + y3 + y4
formative ~~ 1*formative
reflective ~ formative
'
formativeModel2Fit <- sem(
formativeModel2_syntax,
data = PoliticalDemocracy,
missing = "ML",
estimator = "MLR")
formativeModel2Parameters <- parameterEstimates(formativeModel2Fit)
formativeModel2Parameters
fitMeasures(formativeModel2Fit, "df")
semPaths(
formativeModel2Fit,
what = "Std.all",
layout = "tree2",
edge.label.cex = 0.8)
```
Thus, [formative constructs](#formativeConstruct) are challenging to use in a SEM framework.\index{construct!formative}\index{structural equation modeling!model identification}
To estimate a [formative construct](#formativeConstruct) in a SEM framework, the [formative construct](#formativeConstruct) must be used in the context of a model that allows some constraints.\index{construct!formative}\index{structural equation modeling!model identification}\index{structural equation modeling!degrees of freedom}
A [formative](#formativeConstruct) latent factor includes a disturbance term, and is thus not entirely determined by the causal indicators [@Bollen2011].\index{construct!formative}
A composite (such as in [principal component analysis](#pca)), by contrast, has no disturbance term and is therefore completely determined by the composite indicators [@Bollen2011].\index{construct!formative}
Emerging techniques such as confirmatory composite analysis allow estimation of [formative](#formativeConstruct) composites [@Schuberth2023; @Yu2023].\index{construct!formative}\index{confirmatory composite analysis}
Below is an example of confirmatory composite analysis using the Henseler-Ogasawara specification (adapted from: https://confirmatorycompositeanalysis.com/tutorials-lavaan; archived at: https://perma.cc/7LSU-PTZR) [@Schuberth2023]:
```{r}
formativeModel3_syntax <- '
# Specification of the reflective latent factor
reflective =~ y1 + y2 + y3 + y4
# Specification of the associations between the observed variables v1 - v4
# and the emergent variable "formative" in terms of composite loadings.
formative =~ NA*v1 + l11*v1 + l21*v2 + 1*v3 + l41*v4
# Label the variance of the formative composite
formative ~~ varformative*formative
# Specification of the associations between the observed variables v1 - v4
# and their excrescent variables in terms of composite loadings.
nu11 =~ 1*v1 + l22*v2 + l32*v3 + l42*v4
nu12 =~ 0*v1 + 1*v2 + l33*v3 + l43*v4
nu13 =~ 0*v1 + 0*v2 + l34*v3 + 1*v4
# Label the variances of the excrescent variables
nu11 ~~ varnu11*nu11
nu12 ~~ varnu12*nu12
nu13 ~~ varnu13*nu13
# Specify the effect of formative on reflective
reflective ~ formative
# The H-O specification assumes that the excrescent variables are uncorrelated.
# Therefore, the covariance between the excrescent variables is fixed to 0:
nu11 ~~ 0*nu12 + 0*nu13
nu12 ~~ 0*nu13
# Moreover, the H-O specification assumes that the excrescent variables are uncorrelated
# with the emergent and latent variables. Therefore, the covariances between
# the emergent and the excrescent variables are fixed to 0:
formative ~~ 0*nu11 + 0*nu12 + 0*nu13
reflective ~~ 0*nu11 + 0*nu12 + 0*nu13
# In lavaan, the =~ command is originally used to specify a common factor model,
# which assumes that each observed variable is affected by a random measurement error.
# It is assumed that the observed variables forming composites are free from
# random measurement error. Therefore, the variances of the random measurement errors
# originally attached to the observed variables by the common factor model are fixed to 0:
v1 ~~ 0*v1
v2 ~~ 0*v2
v3 ~~ 0*v3
v4 ~~ 0*v4
# Calculate the unstandardized weights to form the formative latent variable
w1 := (-l32 + l22*l33 + l34*l42 - l22*l34*l43)/(1 -
l11*l32 - l21*l33 + l11*l22*l33 - l34*l41 + l11*l34*l42 +
l21* l34* l43 - l11* l22* l34* l43)
w2 := (-l33 + l34*l43)/(1 - l11*l32 - l21*l33 +
l11*l22*l33 - l34*l41 + l11*l34*l42 + l21*l34*l43 - l11*l22*l34*l43)
w3 := 1/(1 - l11*l32 - l21*l33 + l11*l22*l33 -
l34*l41 + l11*l34*l42 + l21*l34*l43 - l11*l22*l34*l43)
w4 := -l34/(1 - l11*l32 - l21*l33 + l11*l22*l33 -
l34*l41 + l11*l34*l42 + l21*l34*l43 - l11*l22*l34*l43)
# Calculate the variances
varv1 := l11^2*varformative + varnu11
varv2 := l21^2*varformative + l22^2*varnu11 + varnu12
varv3 := varformative + l32^2*varnu11 + l33^2*varnu12 + l34^2*varnu13
varv4 := l41^2*varformative + l42^2*varnu11 + l43^2*varnu12 + varnu13
# Calculate the standardized weights to form the formative latent variable
w1std := w1*(varv1/varformative)^(1/2)
w2std := w2*(varv2/varformative)^(1/2)
w3std := w3*(varv3/varformative)^(1/2)
w4std := w4*(varv4/varformative)^(1/2)
'
formativeModel3Fit <- sem(
formativeModel3_syntax,
data = PoliticalDemocracy,
missing = "ML",
estimator = "MLR")
formativeModel3Parameters <- parameterEstimates(formativeModel3Fit)
formativeModel3Parameters
fitMeasures(formativeModel3Fit, "df")
semPaths(
formativeModel3Fit,
what = "Std.all",
layout = "tree2",
edge.label.cex = 0.8)
```
Below is an example of confirmatory composite analysis using the refined Henseler-Ogasawara specification [@Yu2023]:
```{r}
formativeModel4_syntax <- '
# Specification of the reflective latent factor
reflective =~ y1 + y2 + y3 + y4
# Specification of the associations between the observed variables v1 - v4
# and the emergent variable "formative" in terms of composite loadings.
formative =~ NA*v1 + l11*v1 + l21*v2 + 1*v3 + l41*v4
# Label the variance of the formative composite
formative ~~ varformative*formative
# Specification of the associations between the observed variables v1 - v4
# and their excrescent variables in terms of composite loadings.
nu1 =~ 1*v2 + l12*v1
nu2 =~ 1*v3 + l23*v2
nu3 =~ 1*v4 + l34*v3
# Label the variances of the excrescent variables
nu1 ~~ varnu1*nu1
nu2 ~~ varnu2*nu2
nu3 ~~ varnu3*nu3
# Specify the effect of formative on reflective
reflective ~ formative
# Constrain the covariances between excrescent variables and
# other variables in the structural model to zero. Moreover,
# label the covariances among excrescent variables.
nu1 ~~ 0*formative + 0*reflective + cov12*nu2 + cov13*nu3
nu2 ~~ 0*formative + 0*reflective + cov23*nu3
nu3 ~~ 0*formative + 0*reflective
# Fix the variances of the disturbance terms to zero.
v1 ~~ 0*v1
v2 ~~ 0*v2
v3 ~~ 0*v3
v4 ~~ 0*v4
# Calculate the unstandardized weights to form the formative latent variable
w1 := ((1)*((1)*((1)))) / ((l11)*((1)*((1)*((1)))) + -(l21)*((l12)*((1)*((1)))) + (1)*((l12)*((l23)*((1)))) + -(l41)*((l12)*((l23)*((l34)))))
w2 := -((l12)*((1)*((1)))) / ((l11)*((1)*((1)*((1)))) + -(l21)*((l12)*((1)*((1)))) + (1)*((l12)*((l23)*((1)))) + -(l41)*((l12)*((l23)*((l34)))))
w3 := ((l12)*((l23)*((1)))) / ((l11)*((1)*((1)*((1)))) + -(l21)*((l12)*((1)*((1)))) + (1)*((l12)*((l23)*((1)))) + -(l41)*((l12)*((l23)*((l34)))))
w4 := -((l12)*((l23)*((l34)))) / ((l11)*((1)*((1)*((1)))) + -(l21)*((l12)*((1)*((1)))) + (1)*((l12)*((l23)*((1)))) + -(l41)*((l12)*((l23)*((l34)))))
# Calculate the variances
varv1 := ((l11) * (varformative)) * (l11) + ((l12) * (varnu1)) * (l12)
varv2 := ((l21) * (varformative)) * (l21) + ((1) * (varnu1) + (l23) * (cov12)) * (1) + ((1) * (cov12) + (l23) * (varnu2)) * (l23)
varv3 := ((1) * (varformative)) * (1) + ((1) * (varnu2) + (l34) * (cov23)) * (1) + ((1) * (cov23) + (l34) * (varnu3)) * (l34)
varv4 := ((l41) * (varformative)) * (l41) + ((1) * (varnu3)) * (1)
# Calculate the standardized weights to form the formative latent variable
wstdv1 := ((w1) * (sqrt(varv1))) * (1/sqrt(varformative))
wstdv2 := ((w2) * (sqrt(varv2))) * (1/sqrt(varformative))
wstdv3 := ((w3) * (sqrt(varv3))) * (1/sqrt(varformative))
wstdv4 := ((w4) * (sqrt(varv4))) * (1/sqrt(varformative))
'
formativeModel4Fit <- sem(
formativeModel4_syntax,
data = PoliticalDemocracy,
missing = "ML",
estimator = "MLR")
formativeModel4Parameters <- parameterEstimates(formativeModel4Fit)
formativeModel4Parameters
fitMeasures(formativeModel4Fit, "df")
semPaths(
formativeModel4Fit,
what = "Std.all",
layout = "tree2",
edge.label.cex = 0.8)
```
You can generate the weights for the indicators (to be used in the model syntax) for the refined Henseler-Ogasawara specification using the following code:
```{r, eval = FALSE}
library(calculus)
# First, construct the loading matrix
loadingMatrix <- matrix(c('l11','l21',1,'l41','l12',1,0,0,0,'l23',1,0,0,0,'l34',1),4,4)
# Check the structure
loadingMatrix
# Invert the matrix; the first row contains the (unstandardized) weights
# These can be copied and pasted into the lavaan model to specify the weights as new parameters
mxinv(loadingMatrix)
```
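As a numeric check on the matrix-inversion approach described above, the following sketch (base `R` only) plugs hypothetical loading values into the same loading matrix and compares the first row of its inverse with the closed-form weight expressions used in the refined Henseler-Ogasawara syntax.
The loading values are made up for illustration and are not estimates from any model.

```{r, eval = FALSE}
# Hypothetical loading values (illustrative only; not estimates)
l11 <- 0.8
l21 <- 0.6
l41 <- 0.7
l12 <- 0.5
l23 <- -0.4
l34 <- 0.3

# Same layout as the symbolic loading matrix (filled column by column):
# column 1 = loadings on the composite; columns 2-4 = excrescent variables
loadingMatrixNumeric <- matrix(
  c(l11, l21, 1, l41,
    l12, 1, 0, 0,
    0, l23, 1, 0,
    0, 0, l34, 1),
  nrow = 4,
  ncol = 4)

# The first row of the inverse contains the unstandardized weights
weightsFromInverse <- solve(loadingMatrixNumeric)[1, ]

# Closed-form expressions for w1-w4 (the determinant is the denominator)
denominator <- l11 - l21*l12 + l12*l23 - l41*l12*l23*l34
weightsFromFormula <- c(1, -l12, l12*l23, -l12*l23*l34) / denominator

all.equal(weightsFromInverse, weightsFromFormula)
```

If the two sets of weights agree, the expressions pasted into the model syntax are consistent with the inverse of the loading matrix.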
Florian Schuberth provides an `R` function to create the full `lavaan` syntax for confirmatory composite analysis at the following link: https://github.com/FloSchuberth/HOspecification
## Additional Types of SEM {#additionalSEMmodels}
Up to this point, we have discussed SEM with dimensional constructs.\index{dimensional}
It is also worth knowing about additional types of SEM models, including latent class models and mixture models, that handle categorical constructs.\index{categorical}\index{latent class model}\index{mixture model}
However, most disorders are more accurately conceptualized as dimensional than as categorical [@Markon2011], so just because you can estimate categorical latent factors does not necessarily mean that you should.\index{dimensional}\index{categorical}\index{latent class model}\index{mixture model}
### Latent Class Models {#latentClassModels}
In *latent class models*, the construct is not dimensional, but rather categorical.\index{categorical}\index{latent class model}\index{dimensional}
The categorical constructs are latent classifications and are called latent classes.\index{categorical}\index{latent class model}
For instance, the construct could be a diagnosis that influences scores on the measures.\index{categorical}\index{latent class model}\index{diagnosis}
Latent class models examine qualitative differences in kind, rather than quantitative differences in degree.\index{categorical}\index{latent class model}\index{diagnosis}
### Mixture Models {#MixtureModels}
*Mixture models* allow for a combination of latent categorical constructs (classes) and latent dimensional constructs.\index{dimensional}\index{categorical}\index{mixture model}
That is, they allow for both qualitative and quantitative differences.\index{dimensional}\index{categorical}\index{mixture model}
However, this additional model complexity also necessitates a larger sample size for estimation.\index{mixture model}
SEM generally requires a 3-digit sample size ($N = 100+$), whereas mixture models typically require a 4- or 5-digit sample size ($N = 1,000+$).\index{mixture model}
### Exploratory Structural Equation Models {#esemModels}
We describe exploratory structural equation models in Section \@ref(efa-cfa-esem).\index{structural equation modeling!exploratory}
## Causal Diagrams: Directed Acyclic Graphs {#dag}
A key tool when designing a structural equation model is a conceptual depiction of the hypothesized causal processes.\index{structural equation modeling!causal diagram}
A causal diagram depicts the hypothesized causal processes that link two or more variables.\index{structural equation modeling!causal diagram}
A common form of causal diagrams is the directed acyclic graph (DAG).\index{structural equation modeling!causal diagram}
DAGs provide a helpful tool to communicate about causal questions and help identify how to avoid bias (e.g., over- or under-estimation) in associations between variables due to confounding (i.e., common causes) [@Digitale2022].\index{structural equation modeling!causal diagram}
Free tools to create DAGs include the `R` package `dagitty` [@Textor2017] and the associated browser-based extension, DAGitty\index{structural equation modeling!causal diagram}: https://dagitty.net (archived at https://perma.cc/U9BY-VZE2).
Path analytic diagrams (i.e., causal diagrams with boxes, circles, and lines) are described in Section \@ref(cttOverview) of Chapter \@ref(reliability).
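As a minimal sketch of how a DAG can help identify confounding, the following example uses the `dagitty` package to specify a hypothetical three-variable DAG in which `Z` is a common cause of `X` and `Y`, and then asks which variables would need to be adjusted for to estimate the effect of `X` on `Y`.
The variable names and causal structure are assumptions made for illustration.

```{r, eval = FALSE}
library("dagitty")

# Hypothetical DAG: Z is a common cause (confounder) of X and Y
exampleDag <- dagitty("dag {
  Z -> X
  Z -> Y
  X -> Y
}")

# Identify the minimal set(s) of covariates to adjust for
# when estimating the effect of X on Y
exampleAdjustmentSets <- adjustmentSets(
  exampleDag,
  exposure = "X",
  outcome = "Y")

exampleAdjustmentSets
```

Because `Z` opens a backdoor path between `X` and `Y` in this hypothetical DAG, it is the variable flagged for adjustment.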
## Model Fit Indices {#modelFitIndices-sem}
Various model fit indices can be used for evaluating how well a model fits the data and for comparing the fit of two competing models.\index{structural equation modeling!fit index}
Fit indices known as absolute fit indices evaluate how closely the model fits relative to the best-possible fitting model (i.e., a saturated model).\index{structural equation modeling!fit index}
Examples of absolute fit indices include the chi-square test, root mean square error of approximation (RMSEA), and the standardized root mean square residual (SRMR).\index{structural equation modeling!fit index}
The chi-square test evaluates whether the model has a significant degree of misfit relative to the best-possible fitting model (a saturated model that estimates as many parameters as possible; i.e., as many parameters as there are known values, such as the unique observed variances and covariances).\index{structural equation modeling!fit index}
The null hypothesis of the chi-square test is that there is no difference between the predicted data (i.e., the data that would be observed if the model were true) and the observed data.\index{structural equation modeling!fit index}
Thus, a non-significant chi-square test indicates good model fit.\index{structural equation modeling!fit index}
However, because the null hypothesis of the chi-square test is that the model-implied covariance matrix is exactly equal to the observed covariance matrix (i.e., a model of perfect fit), this may be an unrealistic comparison.\index{structural equation modeling!fit index}
Models are simplifications of reality, and our models are virtually never expected to be a perfect description of reality.\index{structural equation modeling!fit index}
Thus, we would say a model is "useful" and partially validated if "it helps us to understand the relation between variables and does a 'reasonable' job of matching the data...A perfect fit may be an inappropriate standard, and a high chi-square estimate may indicate what we already know—that the hypothesized model holds approximately, not perfectly." [@Bollen1989, p. 268].\index{structural equation modeling!fit index}
The power of the chi-square test depends on sample size, and a large sample will likely detect small differences as significantly worse than the best-possible fitting model [@Bollen1989].\index{structural equation modeling!fit index}
RMSEA is an index of absolute fit.\index{structural equation modeling!fit index}
Lower values indicate better fit.\index{structural equation modeling!fit index}
SRMR is an index of absolute fit with no penalty for model complexity.\index{structural equation modeling!fit index}
Lower values indicate better fit.\index{structural equation modeling!fit index}
There are also various fit indices known as incremental, comparative, or relative fit indices that compare whether the model fits better than the worst-possible fitting model (i.e., a "baseline" or "null" model).\index{structural equation modeling!fit index}
Incremental fit indices include a chi-square difference test, the comparative fit index (CFI), and the Tucker-Lewis index (TLI).\index{structural equation modeling!fit index}
Unlike the chi-square test comparing the model to the best-possible fitting model, a significant chi-square test of the relative fit index indicates better fit—i.e., that the model fits better than the worst-possible fitting model.\index{structural equation modeling!fit index}
CFI is another relative fit index that compares the model to the worst-possible fitting model.\index{structural equation modeling!fit index}
Higher values indicate better fit.\index{structural equation modeling!fit index}
TLI is another relative fit index.\index{structural equation modeling!fit index}
Higher values indicate better fit.\index{structural equation modeling!fit index}
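RMSEA, CFI, and TLI can each be computed from the chi-square statistics and degrees of freedom of the fitted model and of the baseline model.
The following sketch in base `R` uses commonly cited formulas; the function name and the input values are hypothetical, and software packages differ in some details (e.g., whether $N$ or $N - 1$ is used for RMSEA, and how robust estimators are handled), so the results may not exactly match a given program's output.

```{r, eval = FALSE}
# Hypothetical helper function (not from any package)
computeFitIndices <- function(chisqModel, dfModel, chisqBaseline, dfBaseline, n) {
  # Misfit beyond what is expected by chance (truncated at zero)
  excessModel <- max(chisqModel - dfModel, 0)
  excessBaseline <- max(chisqBaseline - dfBaseline, 0)

  # RMSEA: misfit per degree of freedom, adjusted for sample size
  rmsea <- sqrt(excessModel / (dfModel * (n - 1)))

  # CFI: proportional reduction in misfit relative to the baseline model
  cfi <- 1 - excessModel / max(excessModel, excessBaseline)

  # TLI: compares chi-square/df ratios of the baseline and fitted models
  ratioModel <- chisqModel / dfModel
  ratioBaseline <- chisqBaseline / dfBaseline
  tli <- (ratioBaseline - ratioModel) / (ratioBaseline - 1)

  c(rmsea = rmsea, cfi = cfi, tli = tli)
}

# Hypothetical chi-square values (illustrative only)
exampleFitIndices <- computeFitIndices(
  chisqModel = 85.3,
  dfModel = 24,
  chisqBaseline = 918.9,
  dfBaseline = 36,
  n = 301)

round(exampleFitIndices, 3)
```

Note that the sketch does not guard against the case in which neither the fitted model nor the baseline model shows any excess misfit, where the CFI denominator would be zero.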
Parsimony fit indices include information criteria, such as the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC).\index{structural equation modeling!fit index}\index{parsimony}
BIC penalizes model complexity more heavily than AIC does.\index{structural equation modeling!fit index}
Lower AIC and BIC values indicate better fit.\index{structural equation modeling!fit index}
Chi-square difference tests and CFI can be used to compare two nested models.\index{structural equation modeling!fit index}
AIC and BIC can be used to compare two non-nested models.\index{structural equation modeling!fit index}
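As a sketch of how two competing models might be compared, the following example fits a one-factor model and a three-factor model to the `HolzingerSwineford1939` data set that ships with `lavaan`, and then compares them using a chi-square difference test along with AIC and BIC.
The model names and factor labels are assumptions made for illustration, and the models are fit with default maximum likelihood rather than the robust estimator used elsewhere in this chapter.

```{r, eval = FALSE}
library("lavaan")

# One-factor model: all nine indicators load on a single factor
oneFactor_syntax <- '
 ability =~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9
'

# Three-factor model: the one-factor model is nested within it
threeFactor_syntax <- '
 visual =~ x1 + x2 + x3
 textual =~ x4 + x5 + x6
 speed =~ x7 + x8 + x9
'

oneFactorFit <- cfa(
  oneFactor_syntax,
  data = HolzingerSwineford1939)

threeFactorFit <- cfa(
  threeFactor_syntax,
  data = HolzingerSwineford1939)

# Chi-square difference test for the nested models
modelComparison <- anova(oneFactorFit, threeFactorFit)
modelComparison

# Information criteria (lower values indicate better fit)
informationCriteria <- rbind(
  oneFactor = c(AIC = AIC(oneFactorFit), BIC = BIC(oneFactorFit)),
  threeFactor = c(AIC = AIC(threeFactorFit), BIC = BIC(threeFactorFit)))
informationCriteria
```

A significant chi-square difference test would suggest that the more constrained (one-factor) model fits significantly worse than the less constrained (three-factor) model.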
Criteria for acceptable fit and good fit of SEM models are in Table \@ref(tab:semFitIndices).\index{structural equation modeling!fit index}
In addition, dynamic fit indexes have been proposed based on simulation to identify fit index cutoffs that are tailored to the characteristics of the specific model and data [@McNeish2023].\index{structural equation modeling!fit index}
Table: (\#tab:semFitIndices) Criteria for Acceptable and Good Fit of Structural Equation Models Based on Fit Indices.
| SEM Fit Index | Acceptable Fit | Good Fit |
|---------------|----------------|------------|
| RMSEA | $\leq$ .08 | $\leq$ .05 |
| CFI | $\geq$ .90 | $\geq$ .95 |
| TLI | $\geq$ .90 | $\geq$ .95 |
| SRMR | $\leq$ .10 | $\leq$ .08 |
However, good model fit does not necessarily indicate a true model.\index{structural equation modeling!fit index}
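The following sketch shows one way to extract fit indices from a fitted `lavaan` model and compare them against the cutoffs in Table \@ref(tab:semFitIndices).
It uses the `HolzingerSwineford1939` data set that ships with `lavaan`; the model and object names are assumptions made for illustration, and the cutoffs are rules of thumb rather than strict decision rules.

```{r, eval = FALSE}
library("lavaan")

exampleModel_syntax <- '
 visual =~ x1 + x2 + x3
 textual =~ x4 + x5 + x6
 speed =~ x7 + x8 + x9
'

exampleModelFit <- cfa(
  exampleModel_syntax,
  data = HolzingerSwineford1939)

# Extract selected global fit indices
fitIndices <- fitMeasures(
  exampleModelFit,
  c("rmsea", "cfi", "tli", "srmr"))

# Compare each index against the cutoffs for acceptable and good fit
fitTable <- data.frame(
  index = c("RMSEA", "CFI", "TLI", "SRMR"),
  value = as.numeric(fitIndices),
  acceptable = c(
    fitIndices[["rmsea"]] <= .08,
    fitIndices[["cfi"]] >= .90,
    fitIndices[["tli"]] >= .90,
    fitIndices[["srmr"]] <= .10),
  good = c(
    fitIndices[["rmsea"]] <= .05,
    fitIndices[["cfi"]] >= .95,
    fitIndices[["tli"]] >= .95,
    fitIndices[["srmr"]] <= .08))

fitTable
```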
In addition to global fit indices, it can also be helpful to examine evidence of local fit, such as the residual covariance matrix.\index{structural equation modeling!fit index}\index{structural equation modeling!residual}
The residual covariance matrix represents the difference between the observed covariance matrix and the model-implied covariance matrix (the observed covariance matrix minus the model-implied covariance matrix).\index{structural equation modeling!fit index}\index{structural equation modeling!residual}
These difference values are called *covariance residuals*.\index{structural equation modeling!fit index}\index{structural equation modeling!residual}
Standardizing the observed and model-implied covariance matrices by converting each to a correlation matrix, and then taking their difference, can be helpful for interpreting the magnitude of any local misfit.\index{structural equation modeling!fit index}\index{structural equation modeling!residual}
This is known as a residual correlation matrix, which is composed of *correlation residuals*.\index{structural equation modeling!fit index}\index{structural equation modeling!residual}
Correlation residuals greater than .10 in absolute value are possible evidence for poor local fit [@Kline2023].\index{structural equation modeling!fit index}\index{structural equation modeling!residual}
If a correlation residual is positive, it suggests that the model underpredicts the observed association between the two variables (i.e., the observed covariance is greater than the model-implied covariance).\index{structural equation modeling!fit index}\index{structural equation modeling!residual}
If a correlation residual is negative, it suggests that the model overpredicts the observed association between the two variables (i.e., the observed covariance is smaller than the model-implied covariance).\index{structural equation modeling!fit index}\index{structural equation modeling!residual}
If the two variables are connected by only indirect pathways, it may be helpful to respecify the model with direct pathways between the two variables, such as a direct effect (i.e., regression path) or a covariance path.\index{structural equation modeling!fit index}\index{structural equation modeling!residual}
For guidance on evaluating local fit, see @Kline2024.\index{structural equation modeling!fit index}\index{structural equation modeling!residual}
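The covariance residuals and correlation residuals described above can be computed directly from the observed and model-implied covariance matrices.
The following is a minimal sketch using the `HolzingerSwineford1939` data set that ships with `lavaan`; the model and object names are assumptions made for illustration.

```{r, eval = FALSE}
library("lavaan")

localFitModel_syntax <- '
 visual =~ x1 + x2 + x3
 textual =~ x4 + x5 + x6
 speed =~ x7 + x8 + x9
'

localFitModelFit <- cfa(
  localFitModel_syntax,
  data = HolzingerSwineford1939)

# Observed and model-implied covariance matrices
observedCov <- unclass(lavInspect(localFitModelFit, "sampstat")$cov)
impliedCov <- unclass(fitted(localFitModelFit)$cov)

# Covariance residuals: observed minus model-implied
covarianceResiduals <- observedCov - impliedCov

# Correlation residuals: convert each matrix to a correlation matrix first
correlationResiduals <- cov2cor(observedCov) - cov2cor(impliedCov)

# Flag pairs of variables with correlation residuals above .10 in absolute value
flagged <- which(
  abs(correlationResiduals) > .10 & upper.tri(correlationResiduals),
  arr.ind = TRUE)

data.frame(
  variable1 = rownames(correlationResiduals)[flagged[, "row"]],
  variable2 = colnames(correlationResiduals)[flagged[, "col"]],
  residual = correlationResiduals[flagged])
```

Positive values in the resulting table point to associations that the model underpredicts, and negative values point to associations that the model overpredicts.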
## Correlation Matrix {#correlationMatrix-sem}
```{r}
cor(mydataSEM, use = "pairwise.complete.obs")
```
Correlation matrices of various types using the `cor.table()` function from the [`petersenlab`](https://github.com/DevPsyLab/petersenlab) package [@R-petersenlab] are in Tables \@ref(tab:corTable1b), \@ref(tab:corTable2b), and \@ref(tab:corTable3b).\index{petersenlab package}\index{correlation}
```{r, eval = FALSE}
cor.table(mydataSEM, dig = 2)
cor.table(mydataSEM, type = "manuscript", dig = 2)
cor.table(mydataSEM, type = "manuscriptBig", dig = 2)
```
```{r, include = FALSE}
corTable1b <- cor.table(
mydataSEM,
dig = 2)
corTable2b <- cor.table(
mydataSEM,
type = "manuscript",
dig = 2)
corTable3b <- cor.table(
mydataSEM,
type = "manuscriptBig",
dig = 2)
```
```{r corTable1b, echo = FALSE}
corTable1b %>%
kable(.,
caption = "Correlation Matrix with *r*, *n*, and *p*-values.",
booktabs = TRUE,
linesep = c("", "", "\\addlinespace"),
escape = FALSE)
```
```{r corTable2b, echo = FALSE}
corTable2b %>%
kable(.,
caption = "Correlation Matrix with Asterisks for Significant Associations.",
booktabs = TRUE,
linesep = "",
escape = FALSE)
```
```{r corTable3b, echo = FALSE}
corTable3b %>%
kable(.,
caption = "Correlation Matrix.",
booktabs = TRUE,
linesep = "")
```
## Measurement Model (of a Given Construct) {#measurementModelExample-sem}
Even though [CFA models](#cfa) are [measurement models](#measurementModel-sem), I provide separate examples of a [measurement model](#measurementModel-sem) and [CFA models](#cfa) because [CFA](#cfa) is often used to test competing factor structures.\index{structural equation modeling!measurement model}\index{factor analysis!confirmatory}
For instance, you could use [CFA](#cfa) to test whether the variance in several measures' scores is best explained with one factor or two factors.\index{factor analysis!confirmatory}
In the [measurement model](#measurementModel-sem) below, I present a simple one-factor model with three measures.\index{structural equation modeling!measurement model}
The [measurement model](#measurementModel-sem) is what we settle on as the estimation of each construct before we add the [structural component](#structuralModel-sem) to estimate the relations among latent variables.\index{structural equation modeling!measurement model}\index{structural equation modeling!structural model}
Basically, we add the [structural component](#structuralModel-sem) onto the [measurement model](#measurementModel-sem).\index{structural equation modeling!measurement model}\index{structural equation modeling!structural model}
In Section \@ref(cfaExample-sem), I present a [CFA model](#cfa) with multiple latent factors.\index{factor analysis!confirmatory}
The measurement models were fit in the `lavaan` package [@R-lavaan].\index{structural equation modeling!measurement model}
### Specify the Model {#measurementModelSyntax-sem}
```{r}
measurementModel_syntax <- '
#Factor loadings
latentFactor =~ measure1 + measure2 + measure3
'
measurementModel_fullSyntax <- '
#Factor loadings (free the factor loading of the first indicator)
latentFactor =~ NA*measure1 + measure2 + measure3
#Fix latent mean to zero
latentFactor ~ 0
#Fix latent variance to one
latentFactor ~~ 1*latentFactor
#Estimate covariances among latent variables (not applicable because there is only one latent variable)
#Estimate residual variances of manifest variables
measure1 ~~ measure1
measure2 ~~ measure2
measure3 ~~ measure3
#Free intercepts of manifest variables
measure1 ~ int1*1
measure2 ~ int2*1
measure3 ~ int3*1
'
```
#### Summary of Model Features {#measurementModelSummary-sem}
```{r}
summary(measurementModel_syntax)
summary(measurementModel_fullSyntax)
```
#### Model Syntax in Table Form {#measurementModelTabular-sem}
```{r}
lavaanify(measurementModel_syntax)
lavaanify(measurementModel_fullSyntax)
```
### Fit the Model {#measurementModelFit-sem}
```{r}
measurementModelFit <- cfa(
measurementModel_syntax,
data = mydataSEM,
missing = "ML",
estimator = "MLR",
std.lv = TRUE)
measurementModelFit_full <- lavaan(
measurementModel_fullSyntax,
data = mydataSEM,
missing = "ML",
estimator = "MLR")
```
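The abbreviated syntax (fit with `cfa()` and `std.lv = TRUE`) and the full syntax (fit with `lavaan()`) are intended to specify the same model.
The following self-contained sketch illustrates this equivalence using the `HolzingerSwineford1939` data set that ships with `lavaan`, rather than the data used in this chapter; the factor and object names are assumptions made for illustration, and the models are fit with default maximum likelihood.
The sketch fixes the latent mean with `0*1`, and it sets `meanstructure = TRUE` in `cfa()` so that both models estimate the indicator intercepts.

```{r, eval = FALSE}
library("lavaan")

# Abbreviated syntax: defaults are filled in by cfa()
shortSyntax <- '
 visual =~ x1 + x2 + x3
'

# Full syntax: every parameter is specified explicitly
fullSyntax <- '
 #Factor loadings (free the factor loading of the first indicator)
 visual =~ NA*x1 + x2 + x3

 #Fix latent mean to zero
 visual ~ 0*1

 #Fix latent variance to one
 visual ~~ 1*visual

 #Estimate residual variances of manifest variables
 x1 ~~ x1
 x2 ~~ x2
 x3 ~~ x3

 #Free intercepts of manifest variables
 x1 ~ 1
 x2 ~ 1
 x3 ~ 1
'

shortFit <- cfa(
  shortSyntax,
  data = HolzingerSwineford1939,
  std.lv = TRUE,
  meanstructure = TRUE)

fullFit <- lavaan(
  fullSyntax,
  data = HolzingerSwineford1939)

# Compare the parameter estimates side by side
estimateComparison <- cbind(
  short = as.numeric(coef(shortFit)),
  full = as.numeric(coef(fullFit)))

round(estimateComparison, 3)
```

If the two columns match (within rounding), the two specifications yield the same estimates.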
### Display Summary Output {#measurementModelOutput-sem}
```{r, include = FALSE}
measurementModelParameters <- parameterEstimates(
measurementModelFit,
standardized = TRUE)
measurementModelParameters_beta1 <- measurementModelParameters[
which(
measurementModelParameters$lhs == "latentFactor" &
measurementModelParameters$rhs == "measure1"),
"std.all"]
measurementModelParameters_beta2 <- measurementModelParameters[
which(