% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/mice.impute.lasso.select.logreg.R
\name{mice.impute.lasso.select.logreg}
\alias{mice.impute.lasso.select.logreg}
\alias{lasso.select.logreg}
\title{Imputation by indirect use of lasso logistic regression}
\usage{
mice.impute.lasso.select.logreg(y, ry, x, wy = NULL, nfolds = 10, ...)
}
\arguments{
\item{y}{Vector to be imputed}
\item{ry}{Logical vector of length \code{length(y)} indicating the
subset \code{y[ry]} of elements in \code{y} to which the imputation
model is fitted. The \code{ry} generally distinguishes the observed
(\code{TRUE}) and missing values (\code{FALSE}) in \code{y}.}
\item{x}{Numeric design matrix with \code{length(y)} rows with predictors for
\code{y}. Matrix \code{x} must not contain missing values.}
\item{wy}{Logical vector of length \code{length(y)}. A \code{TRUE} value
indicates locations in \code{y} for which imputations are created.}
\item{nfolds}{The number of folds for the cross-validation of the lasso penalty.
The default is 10.}
\item{...}{Other named arguments.}
}
\value{
Vector with imputed data, same type as \code{y}, and of length
\code{sum(wy)}
}
\description{
Imputes univariate missing data using logistic regression following a
preprocessing lasso variable selection step.
}
\details{
The method consists of the following steps:
\enumerate{
\item For a given \code{y} variable under imputation, fit a logistic regression with lasso
penalty using \code{y[ry]} as dependent variable and \code{x[ry, ]} as predictors.
The coefficients that are not shrunk to 0 define the active set of predictors
that will be used for imputation.
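\item Fit a logistic regression with the active set of predictors, and obtain
the coefficient estimates and their covariance matrix, (bhat, V(bhat)).
\item Draw BETA from N(bhat, V(bhat)).
\item Compute predicted scores for the missing data, i.e. the inverse logit of
the linear predictor, \eqn{logit^{-1}(X BETA)}{logit^-1(X BETA)}.
\item Compare each score to a random uniform (0,1) deviate, and impute.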
}
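
As a rough illustration of the steps above, the following sketch reproduces
them outside of \code{mice}. The function name
\code{sketch_lasso_select_logreg} is hypothetical and not part of the package.
The sketch assumes a numeric 0/1 \code{y}, a design matrix \code{x} with at
least two columns, and \code{lambda.min} as the selection rule, and it falls
back to using all predictors when \pkg{glmnet} is not installed. It is a
sketch under these assumptions, not the package's implementation.

\preformatted{
# Hypothetical helper (not part of mice): one imputation draw for a 0/1 y.
sketch_lasso_select_logreg <- function(y, ry, x, wy = !ry, nfolds = 10) {
  # Step 1: lasso-penalized logistic regression on the observed cases.
  # Assumption: keep every predictor if glmnet is unavailable.
  active <- seq_len(ncol(x))
  if (requireNamespace("glmnet", quietly = TRUE)) {
    cv <- glmnet::cv.glmnet(x[ry, , drop = FALSE], y[ry],
                            family = "binomial", nfolds = nfolds, alpha = 1)
    # Drop the intercept row; non-zero coefficients form the active set.
    b <- as.matrix(stats::coef(cv, s = "lambda.min"))[-1, 1]
    active <- which(b != 0)
  }

  # Step 2: unpenalized logit on the active set gives bhat and V(bhat).
  xa <- cbind(1, x[, active, drop = FALSE])   # explicit intercept column
  xobs <- xa[ry, , drop = FALSE]
  yobs <- y[ry]
  fit <- stats::glm(yobs ~ 0 + xobs, family = stats::binomial())
  bhat <- stats::coef(fit)
  V <- stats::vcov(fit)

  # Step 3: draw BETA from N(bhat, V(bhat)) via the Cholesky factor of V.
  beta <- bhat + drop(crossprod(chol(V), stats::rnorm(length(bhat))))

  # Step 4: predicted scores for the cases to impute, logit^-1(X BETA).
  xmis <- xa[wy, , drop = FALSE]
  p <- stats::plogis(colSums(t(xmis) * beta))

  # Step 5: compare each score to a uniform (0,1) deviate and impute.
  as.numeric(stats::runif(length(p)) <= p)
}

# Usage on simulated data (all names below are illustrative).
set.seed(1)
n <- 300
x <- matrix(rnorm(n * 4), n, 4)
y <- rbinom(n, 1, plogis(x[, 1] - x[, 2]))
y[sample(n, 60)] <- NA
ry <- !is.na(y)
imp <- sketch_lasso_select_logreg(y, ry, x)
}

The returned vector has one entry per \code{TRUE} value in \code{wy}, which
mirrors the \code{sum(wy)} length stated for the return value of the
documented function.
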
The user can specify a \code{predictorMatrix} in the \code{mice} call
to define which predictors are provided to this univariate imputation method.
The lasso regularization will select, among the variables indicated by
the user, the ones that are important for imputation at any given iteration.
Therefore, users may force the exclusion of a predictor from a given
imputation model by specifying a \code{0} entry in the \code{predictorMatrix}.
However, a non-zero entry does not guarantee the variable will be used,
as this decision is ultimately made by the lasso variable selection
procedure.
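
A minimal sketch of how a predictor might be excluded through the
\code{predictorMatrix} is given below. The simulated data and the variable
names (\code{x1}, \code{x2}, \code{x3}, \code{y}) are illustrative
assumptions, and the example presumes that \pkg{glmnet} is installed.

\preformatted{
library(mice)

# Simulated data: binary y with some missing values, three numeric predictors.
set.seed(2021)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- factor(rbinom(n, 1, plogis(1.5 * dat$x1 - 1.5 * dat$x2)))
dat$y[sample(n, 40)] <- NA

# Use the lasso-select method for y only.
meth <- make.method(dat)
meth["y"] <- "lasso.select.logreg"

# A 0 entry withholds x3 from the lasso when y is imputed;
# x1 and x2 remain candidates but are not guaranteed to be selected.
pred <- make.predictorMatrix(dat)
pred["y", "x3"] <- 0

imp <- mice(dat, method = meth, predictorMatrix = pred,
            m = 2, maxit = 2, printFlag = FALSE)
}
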
The method is based on the Indirect Use of Regularized Regression (IURR) proposed by
Zhao & Long (2016) and Deng et al. (2016).
}
\references{
Deng, Y., Chang, C., Ido, M. S., & Long, Q. (2016). Multiple imputation for
general missing data patterns in the presence of high-dimensional data.
Scientific Reports, 6(1), 1-10.
Zhao, Y., & Long, Q. (2016). Multiple imputation in the presence of
high-dimensional data. Statistical Methods in Medical Research, 25(5),
2021-2035.
}
\seealso{
Other univariate imputation functions:
\code{\link{mice.impute.cart}()},
\code{\link{mice.impute.lasso.logreg}()},
\code{\link{mice.impute.lasso.norm}()},
\code{\link{mice.impute.lasso.select.norm}()},
\code{\link{mice.impute.lda}()},
\code{\link{mice.impute.logreg}()},
\code{\link{mice.impute.logreg.boot}()},
\code{\link{mice.impute.mean}()},
\code{\link{mice.impute.midastouch}()},
\code{\link{mice.impute.mnar.logreg}()},
\code{\link{mice.impute.mpmm}()},
\code{\link{mice.impute.norm}()},
\code{\link{mice.impute.norm.boot}()},
\code{\link{mice.impute.norm.nob}()},
\code{\link{mice.impute.norm.predict}()},
\code{\link{mice.impute.pmm}()},
\code{\link{mice.impute.polr}()},
\code{\link{mice.impute.polyreg}()},
\code{\link{mice.impute.quadratic}()},
\code{\link{mice.impute.rf}()},
\code{\link{mice.impute.ri}()}
}
\author{
Edoardo Costantini, 2021
}
\concept{univariate imputation functions}
\keyword{datagen}