File: plausibleValues.Rd

% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/plausibleValues.R
\name{plausibleValues}
\alias{plausibleValues}
\title{Plausible-Values Imputation of Factor Scores Estimated from a lavaan Model}
\usage{
plausibleValues(object, nDraws = 20L, seed = 12345,
  omit.imps = c("no.conv", "no.se"), ...)
}
\arguments{
\item{object}{A fitted model of class \link[lavaan:lavaan-class]{lavaan::lavaan},
\link[blavaan:blavaan-class]{blavaan::blavaan}, or \link[lavaan.mi:lavaan.mi-class]{lavaan.mi::lavaan.mi}}

\item{nDraws}{\code{integer} specifying the number of draws, analogous to
the number of imputed data sets. If \code{object} is of class
\link[lavaan.mi:lavaan.mi-class]{lavaan.mi::lavaan.mi}, this will be the number of draws taken
\emph{per imputation}.  If \code{object} is of class
\link[blavaan:blavaan-class]{blavaan::blavaan}, \code{nDraws} cannot exceed
\code{blavInspect(object, "niter") * blavInspect(object, "n.chains")}
(number of MCMC samples from the posterior). The drawn samples will be
evenly spaced (after permutation for \code{target="stan"}), using
\code{\link[=ceiling]{ceiling()}} to resolve decimals.}

\item{seed}{\code{integer} passed to \code{\link[=set.seed]{set.seed()}}.}

\item{omit.imps}{\code{character} vector specifying criteria for omitting
imputations when \code{object} is of class \link[lavaan.mi:lavaan.mi-class]{lavaan.mi::lavaan.mi}.
Can include any of \code{c("no.conv", "no.se", "no.npd")}.}

\item{...}{Optional arguments to pass to \code{\link[lavaan:lavPredict]{lavaan::lavPredict()}}.
\code{assemble} will be ignored because multiple groups are always
assembled into a single \code{data.frame} per draw. \code{type} will be
ignored because it is set internally to \code{type="lv"}.}
}
\value{
A \code{list} of length \code{nDraws}, each element of which is a
\code{data.frame} containing plausible values, which can be treated as
a \code{list} of imputed data sets to be passed to \code{\link[=runMI]{runMI()}}
(see \strong{Examples}). If \code{object} is of class
\link[lavaan.mi:lavaan.mi-class]{lavaan.mi::lavaan.mi}, the \code{list} will be of length
\code{nDraws*m}, where \code{m} is the number of imputations.
}
\description{
Draw plausible values of factor scores estimated from a fitted
\code{\link[lavaan:lavaan]{lavaan::lavaan()}} model, then treat them as multiple imputations
of missing data using \code{\link[lavaan.mi:lavaan.mi]{lavaan.mi::lavaan.mi()}}.
}
\details{
Because latent variables are unobserved, they can be considered as missing
data, which can be imputed using Monte Carlo methods.  This may be of
interest to researchers with sample sizes too small to fit their complex
structural models.  Fitting a factor model as a first step,
\code{\link[lavaan:lavPredict]{lavaan::lavPredict()}} provides factor-score estimates, which can
be treated as observed values in a path analysis (Step 2).  However, the
resulting standard errors and test statistics cannot be trusted because
the Step-2 analysis does not take into account the uncertainty about the
estimated factor scores.  Using the asymptotic sampling covariance matrix
of the factor scores provided by \code{\link[lavaan:lavPredict]{lavaan::lavPredict()}},
\code{plausibleValues} draws a set of \code{nDraws} imputations from the
sampling distribution of each factor score, returning a list of data sets
that can be treated like multiple imputations of incomplete data.  If the
data were already imputed to handle missing data, \code{plausibleValues}
also accepts an object of class \link[lavaan.mi:lavaan.mi-class]{lavaan.mi::lavaan.mi}, and will
draw \code{nDraws} plausible values from each imputation.  Step 2 would
then take into account uncertainty about both missing values and factor
scores.  Bayesian methods can also be used to generate factor scores, as
available with the \pkg{blavaan} package, in which case plausible
values are simply saved parameters from the posterior distribution. See
Asparouhov and Muthen (2010) for further technical details and references.
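
The drawing step described above can be sketched in a few lines of R.
This is only a conceptual illustration, not the internal code of
\code{plausibleValues}: the estimates in \code{fs.est} and the covariance
matrix \code{fs.acov} are made-up values for a single case, and the sketch
assumes a multivariate-normal sampling distribution.

\preformatted{
library(MASS)  # for mvrnorm()
set.seed(12345)
## hypothetical factor-score estimates for one case (3 factors)
fs.est <- c(visual = 0.5, textual = -0.2, speed = 0.1)
## hypothetical sampling covariance matrix of those estimates
fs.acov <- matrix(c(0.10, 0.02, 0.01,
                    0.02, 0.08, 0.01,
                    0.01, 0.01, 0.12), nrow = 3,
                  dimnames = list(names(fs.est), names(fs.est)))
nDraws <- 5
## each row is one plausible value (one draw) for this case
PV <- mvrnorm(n = nDraws, mu = fs.est, Sigma = fs.acov)
PV
}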

Each returned \code{data.frame} includes a \code{case.idx} column that
indicates the corresponding rows in the data set to which the model was
originally fitted (unless the user requests only Level-2 variables).  This
can be used to merge the plausible values with the original observed data,
but users should note that including any new variables in a Step-2 model
might not accurately account for their relationship(s) with factor scores
because they were not accounted for in the Step-1 model from which factor
scores were estimated.

If \code{object} is a multilevel \code{lavaan} model, users can request
plausible values for latent variables at particular levels of analysis by
setting the \code{\link[lavaan:lavPredict]{lavaan::lavPredict()}} argument \code{level=1} or
\code{level=2}.  If the \code{level} argument is not passed via \dots,
then both levels are returned in a single merged data set per draw.  For
multilevel models, each returned \code{data.frame} also includes a column
indicating to which cluster each row belongs (unless the user requests only
Level-2 variables).
}
\examples{

## example from ?cfa and ?lavPredict help pages
HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

fit1 <- cfa(HS.model, data = HolzingerSwineford1939)
fs1 <- plausibleValues(fit1, nDraws = 3,
                       ## lavPredict() can add only the modeled data
                       append.data = TRUE)
lapply(fs1, head)
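
## A rough way to see the uncertainty that plausible values carry is to
## compare draws of the same factor score across the returned data sets.
## This sketch assumes each data.frame in fs1 has a column named "visual"
## (the latent-variable name in HS.model) and the same row order.
visual.draws <- sapply(fs1, function(d) d$visual) # cases x draws matrix
head(cbind(mean = rowMeans(visual.draws),
           sd   = apply(visual.draws, 1, sd)))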

\donttest{
## To merge factor scores to original data.frame (not just modeled data)
fs1 <- plausibleValues(fit1, nDraws = 3)
idx <- lavInspect(fit1, "case.idx")      # row index for each case
if (is.list(idx)) idx <- do.call(c, idx) # for multigroup models
data(HolzingerSwineford1939)             # copy data to workspace
HolzingerSwineford1939$case.idx <- idx   # add row index as variable
## loop over draws to merge original data with factor scores
for (i in seq_along(fs1)) {
  fs1[[i]] <- merge(fs1[[i]], HolzingerSwineford1939, by = "case.idx")
}
lapply(fs1, head)


## multiple-group analysis, in 2 steps
step1 <- cfa(HS.model, data = HolzingerSwineford1939, group = "school",
            group.equal = c("loadings","intercepts"))
PV.list <- plausibleValues(step1)

## subsequent path analysis
path.model <- ' visual ~ c(t1, t2)*textual + c(s1, s2)*speed '
if(requireNamespace("lavaan.mi")){
  library(lavaan.mi)
  step2 <- sem.mi(path.model, data = PV.list, group = "school")
  ## test equivalence of both slopes across groups
  lavTestWald.mi(step2, constraints = 't1 == t2 ; s1 == s2')
}


## multilevel example from ?Demo.twolevel help page
model <- '
  level: 1
    fw =~ y1 + y2 + y3
    fw ~ x1 + x2 + x3
  level: 2
    fb =~ y1 + y2 + y3
    fb ~ w1 + w2
'
msem <- sem(model, data = Demo.twolevel, cluster = "cluster")
mlPVs <- plausibleValues(msem, nDraws = 3) # both levels by default
lapply(mlPVs, head, n = 10)
## only Level 1
mlPV1 <- plausibleValues(msem, nDraws = 3, level = 1)
lapply(mlPV1, head)
## only Level 2
mlPV2 <- plausibleValues(msem, nDraws = 3, level = 2)
lapply(mlPV2, head)



## example with 20 multiple imputations of missing data:
nPVs <- 5
nImps <- 20

if(requireNamespace("lavaan.mi")){
  data(HS20imps, package = "lavaan.mi")

  ## specify CFA model from lavaan's ?cfa help page
  HS.model <- '
    visual  =~ x1 + x2 + x3
    textual =~ x4 + x5 + x6
    speed   =~ x7 + x8 + x9
  '
  out2 <- cfa.mi(HS.model, data = HS20imps)
  PVs <- plausibleValues(out2, nDraws = nPVs)
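  ## Per the Value section, the list should hold nPVs sets of plausible
  ## values for each imputation, so this check is expected to hold as
  ## long as no imputations were dropped via omit.imps (an assumption).
  print(length(PVs) == nPVs * nImps)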

  idx <- out2@Data@case.idx # can't use lavInspect() on lavaan.mi
  ## empty list to hold expanded imputations
  impPVs <- list()
  for (m in 1:nImps) {
    HS20imps[[m]]["case.idx"] <- idx
    for (i in 1:nPVs) {
      impPVs[[ nPVs*(m - 1) + i ]] <- merge(HS20imps[[m]],
                                            PVs[[ nPVs*(m - 1) + i ]],
                                            by = "case.idx")
    }
  }
  lapply(impPVs, head)
}

}

}
\references{
Asparouhov, T. & Muthen, B. O. (2010). \emph{Plausible values for latent
variables using M}plus. Technical Report. Retrieved from
\url{www.statmodel.com/download/Plausible.pdf}
}
\seealso{
\code{\link[lavaan.mi:lavaan.mi]{lavaan.mi::lavaan.mi()}}, \link[lavaan.mi:lavaan.mi-class]{lavaan.mi::lavaan.mi}
}
\author{
Terrence D. Jorgensen (University of Amsterdam;
\email{TJorgensen314@gmail.com})
}