% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/pool_parameters.R
\name{pool_parameters}
\alias{pool_parameters}
\title{Pool Model Parameters}
\usage{
pool_parameters(
  x,
  exponentiate = FALSE,
  effects = "fixed",
  component = "all",
  verbose = TRUE,
  ...
)
}
\arguments{
\item{x}{A list of \code{parameters_model} objects, as returned by
\code{\link[=model_parameters]{model_parameters()}}, or a list of model objects that are supported by
\code{model_parameters()}.}
\item{exponentiate}{Logical, indicating whether or not to exponentiate the
coefficients (and related confidence intervals). This is typical for
logistic regression, or more generally speaking, for models with log or
logit links. It is also recommended to use \code{exponentiate = TRUE} for models
with log-transformed response values. For models with a log-transformed
response variable, when \code{exponentiate = TRUE}, a one-unit increase in the
predictor is associated with multiplying the outcome by that predictor's
exponentiated coefficient. \strong{Note:} Delta-method standard errors are also computed (by
multiplying the standard errors by the transformed coefficients). This is
to mimic behaviour of other software packages, such as Stata, but these
standard errors poorly estimate uncertainty for the transformed
coefficient. The transformed confidence interval more clearly captures this
uncertainty. For \code{compare_parameters()}, \code{exponentiate = "nongaussian"}
will only exponentiate coefficients from non-Gaussian families.}
\item{effects}{Should parameters for fixed effects (\code{"fixed"}), random
effects (\code{"random"}), both fixed and random effects (\code{"all"}), or the
overall (sum of fixed and random) effects (\code{"random_total"}) be returned?
Only applies to mixed models. May be abbreviated. If the calculation of
random effects parameters takes too long, you may use \code{effects = "fixed"}.}
\item{component}{Which type of parameters to return, such as parameters for the
conditional model, the zero-inflation part of the model, the dispersion
term, or other auxiliary parameters. Applies to models with a
zero-inflation and/or dispersion formula, or if parameters such as \code{sigma}
should be included. May be abbreviated. Note that the \emph{conditional}
component is also called \emph{count} or \emph{mean} component, depending on the
model. There are three convenient shortcuts: \code{component = "all"} returns
all possible parameters. If \code{component = "location"}, location parameters
such as \code{conditional}, \code{zero_inflated}, or \code{smooth_terms} are returned
(everything that is a fixed or random effect, depending on the \code{effects}
argument, but no auxiliary parameters). For \code{component = "distributional"}
(or \code{"auxiliary"}), components like \code{sigma}, \code{dispersion}, or \code{beta}
(and other auxiliary parameters) are returned.}
\item{verbose}{Toggle warnings and messages.}
\item{...}{Arguments passed down to \code{model_parameters()}, if \code{x} is a list
of model-objects. Can be used, for instance, to specify arguments like
\code{ci} or \code{ci_method} etc.}
}
\value{
A data frame of indices related to the model's parameters.
}
\description{
This function "pools" (i.e. combines) model parameters in a similar fashion
as \code{mice::pool()}. However, this function pools parameters from
\code{parameters_model} objects, as returned by
\code{\link[=model_parameters]{model_parameters()}}.
}
\details{
Averaging of parameters follows Rubin's rules (\emph{Rubin, 1987, p. 76}).
The pooled degrees of freedom are based on the Barnard-Rubin adjustment for
small samples (\emph{Barnard and Rubin, 1999}).
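
As a brief sketch of these rules, using standard notation that is introduced
here for illustration and is not taken from the package source: for \eqn{m}
imputed datasets with parameter estimates \eqn{\hat{Q}_i} and squared standard
errors \eqn{U_i}, the pooled estimate \eqn{\bar{Q}} and its total variance
\eqn{T} are

\deqn{\bar{Q} = \frac{1}{m}\sum_{i=1}^{m}\hat{Q}_i, \qquad
\bar{U} = \frac{1}{m}\sum_{i=1}^{m} U_i, \qquad
B = \frac{1}{m-1}\sum_{i=1}^{m}(\hat{Q}_i - \bar{Q})^2, \qquad
T = \bar{U} + \left(1 + \frac{1}{m}\right) B.}

The pooled standard error is \eqn{\sqrt{T}}. With
\eqn{\lambda = (1 + 1/m) B / T} and complete-data degrees of freedom
\eqn{\nu_{com}}, the Barnard-Rubin adjusted degrees of freedom are

\deqn{\nu_{old} = \frac{m - 1}{\lambda^2}, \qquad
\nu_{obs} = \frac{\nu_{com} + 1}{\nu_{com} + 3}\,\nu_{com}\,(1 - \lambda), \qquad
\nu = \frac{\nu_{old}\,\nu_{obs}}{\nu_{old} + \nu_{obs}}.}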
}
\note{
Models with multiple components (for instance, models with zero-inflation,
where predictors appear in the count and zero-inflation part, or models with
a dispersion component) may fail in rare situations. In this case, compute
the pooled parameters for components separately, using the \code{component}
argument.

Some model objects do not return standard errors (e.g. objects of class
\code{htest}). For these models, neither pooled confidence intervals nor
p-values are returned.
}
\examples{
\dontshow{if (require("mice") && require("datawizard")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf}
# example for multiple imputed datasets
data("nhanes2", package = "mice")
imp <- mice::mice(nhanes2, printFlag = FALSE)
models <- lapply(1:5, function(i) {
  lm(bmi ~ age + hyp + chl, data = mice::complete(imp, action = i))
})
pool_parameters(models)
# should be identical to:
m <- with(data = imp, expr = lm(bmi ~ age + hyp + chl))
summary(mice::pool(m))
# For glm, mice uses residual df, while `pool_parameters()` uses `Inf`
nhanes2$hyp <- datawizard::slide(as.numeric(nhanes2$hyp))
imp <- mice::mice(nhanes2, printFlag = FALSE)
models <- lapply(1:5, function(i) {
  glm(hyp ~ age + chl, family = binomial, data = mice::complete(imp, action = i))
})
m <- with(data = imp, expr = glm(hyp ~ age + chl, family = binomial))
# residual df
summary(mice::pool(m))$df
# df = Inf
pool_parameters(models)$df_error
# use residual df instead
pool_parameters(models, ci_method = "residual")$df_error
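
# Illustration only: pooling the glm fits from above by hand with Rubin's
# rules, using base R. This is a sketch of the general technique, assuming
# every model in `models` estimates the same set of coefficients. It is not
# necessarily the exact internal code of `pool_parameters()`. The names
# `n_imp`, `estimates`, `within_var`, `between_var`, `pooled_est` and
# `pooled_se` are introduced here for illustration.
n_imp <- length(models)
# one column of coefficients per imputed dataset
estimates <- sapply(models, coef)
# within-imputation variance: mean of the squared standard errors
within_var <- rowMeans(sapply(models, function(i) diag(vcov(i))))
# between-imputation variance: variance of the estimates across imputations
between_var <- apply(estimates, 1, var)
# pooled estimate and pooled standard error (square root of total variance)
pooled_est <- rowMeans(estimates)
pooled_se <- sqrt(within_var + (1 + 1 / n_imp) * between_var)
# the result can be compared with the output of `pool_parameters(models)`
cbind(Coefficient = pooled_est, SE = pooled_se)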
\dontshow{\}) # examplesIf}
}
\references{
Barnard, J. and Rubin, D.B. (1999). Small sample degrees of freedom with
multiple imputation. Biometrika, 86, 948-955.

Rubin, D.B. (1987). Multiple Imputation for Nonresponse in Surveys. New
York: John Wiley and Sons.
}