% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/mice.impute.lasso.logreg.R
\name{mice.impute.lasso.logreg}
\alias{mice.impute.lasso.logreg}
\alias{lasso.logreg}
\title{Imputation by direct use of lasso logistic regression}
\usage{
mice.impute.lasso.logreg(y, ry, x, wy = NULL, nfolds = 10, ...)
}
\arguments{
\item{y}{Vector to be imputed}

\item{ry}{Logical vector of length \code{length(y)} indicating
the subset \code{y[ry]} of elements in \code{y} to which the imputation
model is fitted. The \code{ry} generally distinguishes the observed
(\code{TRUE}) and missing values (\code{FALSE}) in \code{y}.}

\item{x}{Numeric design matrix with \code{length(y)} rows with predictors for
\code{y}. Matrix \code{x} must not contain missing values.}

\item{wy}{Logical vector of length \code{length(y)}. A \code{TRUE} value
indicates locations in \code{y} for which imputations are created.}

\item{nfolds}{The number of folds for the cross-validation of the lasso penalty.
The default is 10.}

\item{...}{Other named arguments.}
}
\value{
Vector with imputed data, same type as \code{y}, and of length
\code{sum(wy)}.
}
\description{
Imputes univariate missing binary data using lasso logistic regression with bootstrap.
}
\details{
The method consists of the following steps:
\enumerate{
\item For a given y variable under imputation, draw a bootstrap version y*
with replacement from the observed cases \code{y[ry]}, and store in x* the
corresponding values from \code{x[ry, ]}.
\item Fit a regularised (lasso) logistic regression with y* as the outcome,
and x* as predictors.
A vector of regression coefficients bhat is obtained.
All of these coefficients are considered random draws from the posterior
distribution of the imputation model parameters.
Some of these coefficients will be shrunk to 0.
\item Compute predicted scores for the missing data, i.e.
\eqn{logit^{-1}(X bhat)}{logit^-1(X bhat)}.
\item Compare each score to a random uniform (0,1) deviate, and impute.
}
The method is based on the Direct Use of Regularized Regression (DURR) proposed by
Zhao & Long (2016) and Deng et al. (2016).
}
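\section{Illustrative sketch}{
The steps listed in the details can be sketched in a few lines of R. This is
a simplified illustration under stated assumptions, not the package's actual
source: the function name \code{durr_logreg_sketch} is hypothetical,
\code{y} is assumed to be coded 0/1, \code{x} is assumed to have at least two
columns, and the penalty is assumed to be taken at \code{lambda.min} from
\code{glmnet::cv.glmnet()}.
\preformatted{
durr_logreg_sketch <- function(y, ry, x, wy = !ry, nfolds = 10) {
  # Step 1: bootstrap the observed cases (rows where ry is TRUE)
  idx <- which(ry)
  s <- idx[sample.int(length(idx), replace = TRUE)]
  y_star <- y[s]
  x_star <- x[s, , drop = FALSE]

  # Step 2: lasso (alpha = 1) logistic regression on the bootstrap sample,
  # with the penalty chosen by nfolds-fold cross-validation
  fit <- glmnet::cv.glmnet(x = x_star, y = y_star, family = "binomial",
                           nfolds = nfolds, alpha = 1)

  # Step 3: predicted probabilities for the cases to be imputed
  p <- predict(fit, newx = x[wy, , drop = FALSE], s = "lambda.min",
               type = "response")

  # Step 4: compare each probability with a uniform(0, 1) deviate
  as.integer(runif(length(p)) <= p)
}
}
Because the coefficients are re-estimated on a fresh bootstrap sample at each
call, repeated calls give different imputations, which is what lets the
bootstrap stand in for a draw from the parameter distribution.
}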
\references{
Deng, Y., Chang, C., Ido, M. S., & Long, Q. (2016). Multiple imputation for
general missing data patterns in the presence of high-dimensional data.
Scientific Reports, 6(1), 1-10.

Zhao, Y., & Long, Q. (2016). Multiple imputation in the presence of
high-dimensional data. Statistical Methods in Medical Research, 25(5),
2021-2035.
}
\seealso{
Other univariate imputation functions: 
\code{\link{mice.impute.cart}()},
\code{\link{mice.impute.lasso.norm}()},
\code{\link{mice.impute.lasso.select.logreg}()},
\code{\link{mice.impute.lasso.select.norm}()},
\code{\link{mice.impute.lda}()},
\code{\link{mice.impute.logreg}()},
\code{\link{mice.impute.logreg.boot}()},
\code{\link{mice.impute.mean}()},
\code{\link{mice.impute.midastouch}()},
\code{\link{mice.impute.mnar.logreg}()},
\code{\link{mice.impute.mpmm}()},
\code{\link{mice.impute.norm}()},
\code{\link{mice.impute.norm.boot}()},
\code{\link{mice.impute.norm.nob}()},
\code{\link{mice.impute.norm.predict}()},
\code{\link{mice.impute.pmm}()},
\code{\link{mice.impute.polr}()},
\code{\link{mice.impute.polyreg}()},
\code{\link{mice.impute.quadratic}()},
\code{\link{mice.impute.rf}()},
\code{\link{mice.impute.ri}()}
}
\author{
Edoardo Costantini, 2021
}
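\examples{
# A minimal usage sketch, not taken from the package sources. The toy data,
# the object names and the seed are illustrative assumptions; only the
# argument names (y, ry, x, nfolds) come from the usage section above.
# The method relies on the 'glmnet' package being installed.
\dontrun{
set.seed(123)
n <- 200
x <- matrix(rnorm(n * 3), nrow = n, ncol = 3)   # complete numeric predictors
y <- rbinom(n, 1, plogis(x[, 1]))               # binary outcome coded 0/1
ry <- rep(TRUE, n)
ry[sample(n, 40)] <- FALSE                      # treat 40 cases as missing

# Direct call: one imputed value per element of wy (by default, !ry)
imps <- mice.impute.lasso.logreg(y = y, ry = ry, x = x, nfolds = 5)
length(imps) == sum(!ry)

# Within mice(), the same method is selected with method = "lasso.logreg"
}
}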
\concept{univariate imputation functions}
\keyword{datagen}