Background: EFA models using lavaan

Go back

The lavaan¹ syntax since version 0.6.4 allows for exploratory blocks for latent variables. The EFAST package builds on this functionality to combine exploratory latent variable models (such as EFA) with structural parameters (in EFAST the residual covariance structure). Here is how to create a basic 3-factor EFA model in lavaan using the x1 - x9 variables from the famous Holzinger and Swineford data included in the lavaan package:

data("HolzingerSwineford1939")
efa_model <- "
  efa('block1')*F1 =~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9
  efa('block1')*F2 =~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9
  efa('block1')*F3 =~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9
"

The efa('block1') modifier indicates that the factors F1, F2, and F3 comprise an exploratory block (with the name block1) and thus that we do not exactly know the loading structure of the variables x1 - x9 on these three factors.

The EFA model can then be estimated as follows, (releasing the residual variances and implying the EFA constraints):

efa_hs39 <- lavaan(
  model          = efa_model, 
  data           = HolzingerSwineford1939,  
  auto.var       = TRUE, 
  auto.efa       = TRUE
)

In the results, among the parameter estimates we will see the pattern of factor loadings resulting from the oblique geomin rotation:

summary(efa_hs39)

## lavaan 0.6-5 ended normally after 53 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of free parameters                         36
##   Number of equality constraints                     3
##   Row rank of the constraints matrix                 3
##                                                       
##   Rotation method                       GEOMIN OBLIQUE
##   Geomin epsilon                                  0.01
##   Rotation algorithm (rstarts)               GPA (100)
##   Standardized metric                            FALSE
##   Row weights                                     None
##                                                       
##   Number of observations                           301
##                                                       
## Model Test User Model:
##                                                       
##   Test statistic                                22.897
##   Degrees of freedom                                12
##   P-value (Chi-square)                           0.029
## 
## Parameter Estimates:
## 
##   Information                                 Expected
##   Information saturated (h1) model          Structured
##   Standard errors                             Standard
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   F1 =~ block1                                        
##     x1                0.715    0.091    7.876    0.000
##     x2                0.588    0.095    6.205    0.000
##     x3                0.787    0.086    9.149    0.000
##     x4                0.041    0.038    1.061    0.289
##     x5               -0.070    0.041   -1.707    0.088
##     x6                0.098    0.056    1.759    0.079
##     x7               -0.096    0.097   -0.992    0.321
##     x8                0.169    0.140    1.203    0.229
##     x9                0.414    0.114    3.636    0.000
##   F2 =~ block1                                        
##     x1                0.209    0.091    2.303    0.021
##     x2                0.042    0.053    0.787    0.432
##     x3               -0.092    0.072   -1.275    0.202
##     x4                0.971    0.061   15.891    0.000
##     x5                1.140    0.067   16.988    0.000
##     x6                0.877    0.061   14.475    0.000
##     x7                0.042    0.051    0.818    0.413
##     x8               -0.042    0.045   -0.939    0.348
##     x9                0.023    0.039    0.604    0.546
##   F3 =~ block1                                        
##     x1                0.023    0.045    0.520    0.603
##     x2               -0.145    0.085   -1.702    0.089
##     x3                0.012    0.050    0.240    0.810
##     x4                0.005    0.041    0.132    0.895
##     x5                0.011    0.034    0.319    0.750
##     x6               -0.013    0.040   -0.330    0.741
##     x7                0.774    0.076   10.214    0.000
##     x8                0.693    0.094    7.376    0.000
##     x9                0.451    0.076    5.909    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .x1                0.696    0.087    8.038    0.000
##    .x2                1.035    0.102   10.151    0.000
##    .x3                0.692    0.097    7.134    0.000
##    .x4                0.377    0.048    7.902    0.000
##    .x5                0.403    0.061    6.590    0.000
##    .x6                0.365    0.042    8.613    0.000
##    .x7                0.594    0.106    5.624    0.000
##    .x8                0.479    0.080    5.958    0.000
##    .x9                0.551    0.060    9.132    0.000
##     F1                1.000                           
##     F2                1.000                           
##     F3                1.000

In this summary, we can see that the geomin rotation has produced the structure we expect for this data: the first factor is most associated with the first three variables, the second with x4 - x6, and the last three variables load mostly on the third factor.

This is also the same structure we get when we run an oblique geomin-rotated exploratory factor analysis using the same data in the psych package (except for the ordering of the factors):

efa_psych <- psych::fa(r = HolzingerSwineford1939[, 7:15], nfactors = 3, rotate = "geominQ")
loadings(efa_psych)

## 
## Loadings:
##    MR1    MR3    MR2   
## x1  0.193  0.595       
## x2         0.507 -0.125
## x3         0.689       
## x4  0.844              
## x5  0.884              
## x6  0.804              
## x7        -0.128  0.733
## x8         0.148  0.682
## x9         0.399  0.451
## 
##                  MR1   MR3   MR2
## SS loadings    2.187 1.294 1.223
## Proportion Var 0.243 0.144 0.136
## Cumulative Var 0.243 0.387 0.523

The EFAST package implements this exact pattern, and allows for extracting the loadings similar to the psych package:

efa_efast <- efast_efa(data = HolzingerSwineford1939[, 7:15], M = 3)
efast_loadings(efa_efast)

## 
## Loadings:
##    F1     F2     F3    
## x1  0.603  0.187       
## x2  0.506        -0.119
## x3  0.690              
## x4         0.838       
## x5         0.886       
## x6         0.805       
## x7 -0.150         0.725
## x8  0.106         0.702
## x9  0.368         0.462
## 
##                   F1    F2    F3
## SS loadings    1.275 2.180 1.248
## Proportion Var 0.142 0.242 0.139
## Cumulative Var 0.142 0.384 0.523

Bonus: a shorter way to write the same syntax would be:

efa('block1')*F1 + 
efa('block1')*F2 + 
efa('block1')*F3 =~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9

which is how the EFAST package works (see Extending EFAST models).

Yves Rosseel (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2), 1-36. URL http://www.jstatsoft.org/v48/i02/.↩︎

Background: EFA models using lavaan

Erik-Jan van Kesteren & Rogier Kievit