1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188
  
     | 
    
      % Generated by roxygen2: do not edit by hand
% Please edit documentation in R/cAIC.R
\name{cAIC}
\alias{cAIC}
\title{Conditional Akaike Information for 'lme4' and 'lme'}
\usage{
cAIC(object, method = NULL, B = NULL, sigma.penalty = 1, analytic = TRUE)
}
\arguments{
\item{object}{An object of class merMod either fitted by
\code{\link[lme4]{lmer}} or \code{\link[lme4]{glmer}} of the lme4-package
or an \code{\link[nlme]{lme}} object fro the nlme-package.
Also objects returned form a \code{\link[gamm4]{gamm4}} call are possible.}
\item{method}{Either \code{"conditionalBootstrap"} for the estimation of the
degrees of freedom with the help of conditional Bootstrap or
\code{"steinian"} for analytical representations based on Stein type
formulas. The default is \code{NULL}. In this case the method is choosen
automatically based on the \code{family} argument of the
\code{(g)lmer}-object. For \code{"gaussian"} and \code{"poisson"} this is
the Steinian type estimator, for all others it is the conditional Bootstrap.
For models from the nlme package, only \code{\link[nlme]{lme}} objects, i.e.,
with gaussian response are supported.}
\item{B}{Number of Bootstrap replications. The default is \code{NULL}. Then
B is the minimum of 100 and the length of the response vector.}
\item{sigma.penalty}{An integer value for additional penalization in the analytic 
Gaussian calculation to account for estimated variance components in the residual (co-)variance. 
Per default \code{sigma.penalty} is equal \code{1}, corresponding to a diagonal error 
covariance matrix with only one estimated parameter (sigma). If 
all variance components are known, the value should be set to \code{0}. 
For individual weights (individual variances), this value should be set
to the number of estimated weights. For \code{\link[nlme]{lme}} objects
the penalty term is automatically set by extracting the number of estimated
variance components.}
\item{analytic}{FALSE if the numeric hessian of the (restricted) marginal
log-likelihood from the lmer optimization procedure should be used.
Otherwise (default) TRUE, i.e.  use a analytical version that has to be
computed. Only used for the analytical version of Gaussian responses.}
}
\value{
A \code{cAIC} object, which is a list consisting of: 
1. the conditional log likelihood, i.e. the log likelihood with the random 
effects as penalized parameters; 2. the estimated degrees of freedom; 
3. a list element that is either \code{NULL}
if no new model was fitted otherwise the new (reduced) model, see details;
4. a boolean variable indicating whether a new model was fitted or not; 5.
the estimator of the conditional Akaike information, i.e. minus twice the
log likelihood plus twice the degrees of freedom.
}
\description{
Estimates the conditional Akaike information for models that were fitted in
'lme4' or with 'lme'. Currently all distributions are supported for 'lme4' models, 
based on parametric conditional bootstrap. 
For the Gaussian distribution (from a \code{\link[lme4]{lmer}} or \code{\link[nlme]{lme}}
call) and the Poisson distribution analytical estimators for the degrees of
freedom are available, based on Stein type formulas. Also the conditional
Akaike information for generalized additive models based on a fit via the
'gamm4' or \code{\link[mgcv]{gamm}} calls from the 'mgcv' package can be estimated.
A hands-on tutorial for the package can be found at \url{https://arxiv.org/abs/1803.05664}.
}
\details{
For \code{method = "steinian"} and an object of class \code{merMod} computed
the analytic representation of the corrected conditional AIC in Greven and
Kneib (2010). This is based on a the Stein formula and uses implicit
differentiation to calculate the derivative of the random effects covariance
parameters w.r.t.  the data. The code is adapted form the one provided in
the supplementary material of the paper by Greven and Kneib (2010). The
supplied \code{\link[lme4]{merMod}} model needs to be checked if a random
effects covariance parameter has an optimum on the boundary, i.e. is zero.
And if so the model needs to be refitted with the according random effect
terms omitted. This is also done by the function and the refitted model is
also returned. Notice that the \code{boundary.tol} argument in
\code{\link[lme4]{lmerControl}} has an impact on whether a parameter is
estimated to lie on the boundary of the parameter space. For estimated error
variance the degrees of freedom are increased by one per default. 
\code{sigma.penalty} can be set manually for \code{\link[lme4]{merMod}} models 
if no (0) or more than one variance component (>1) has been estimated. For 
\code{\link[nlme]{lme}} objects this value is automatically defined.
If the object is of class \code{\link[lme4]{merMod}} and has \code{family =
"poisson"} there is also an analytic representation of the conditional AIC
based on the Chen-Stein formula, see for instance Saefken et. al (2014). For
the calculation the model needs to be refitted for each observed response
variable minus the number of response variables that are exactly zero. The
calculation therefore takes longer then for models with Gaussian responses.
Due to the speed and stability of 'lme4' this is still possible, also for
larger datasets.
If the model has Bernoulli distributed responses and \code{method =
"steinian"}, \code{\link{cAIC}} calculates the degrees of freedom based on a
proposed estimator by Efron (2004). This estimator is asymptotically
unbiased if the estimated conditional mean is consistent. The calculation
needs as many model refits as there are data points.
Another more general method for the estimation of the degrees of freedom is
the conditional bootstrap. This is proposed in Efron (2004). For the B
boostrap samples the degrees of freedom are estimated by \deqn{\frac{1}{B -
1}\sum_{i=1}^n\theta_i(z_i)(z_i-\bar{z}),} where \eqn{\theta_i(z_i)} is the
i-th element of the estimated natural parameter.
For models with no random effects, i.e. (g)lms, the \code{\link{cAIC}}
function returns the AIC of the model with scale parameter estimated by REML.
}
\section{WARNINGS }{
 Currently the cAIC can only be estimated for
\code{family} equal to \code{"gaussian"}, \code{"poisson"} and
\code{"binomial"}. Neither negative binomial nor gamma distributed responses
are available. 
Weighted Gaussian models are not yet implemented.
}
\examples{
### Three application examples
b <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
cAIC(b)
b2 <- lmer(Reaction ~ (1 | Days) + (1 | Subject), sleepstudy)
cAIC(b2)
b2ML <- lmer(Reaction ~ (1 + Days | Subject), sleepstudy, REML = FALSE)
cAIC(b2ML)
### Demonstration of boundary case
\dontrun{
set.seed(2017-1-1)
n <- 50
beta <- 2
x <- rnorm(n)
eta <- x*beta
id <- gl(5,10)
epsvar <- 1
data <- data.frame(x = x, id = id)
y_wo_bi <- eta + rnorm(n, 0, sd = epsvar) 
# use a very small RE variance
ranvar <- 0.05
nrExperiments <- 100
sim <- sapply(1:nrExperiments, function(j){
b_i <- scale(rnorm(5, 0, ranvar), scale = FALSE)
y <- y_wo_bi + model.matrix(~ -1 + id) \%*\% b_i
data$y <- y
mixedmod <- lmer(y ~ x + (1 | id), data = data)
linmod <- lm(y ~ x, data = data)
c(cAIC(mixedmod)$caic, cAIC(linmod)$caic)
})
rownames(sim) <- c("mixed model", "linear model")
boxplot(t(sim))
}
}
\references{
Saefken, B., Ruegamer, D., Kneib, T. and Greven, S. (2021):
Conditional Model Selection in Mixed-Effects Models with cAIC4.
<doi:10.18637/jss.v099.i08>
Saefken, B., Kneib T., van Waveren C.-S. and Greven, S. (2014) A
unifying approach to the estimation of the conditional Akaike information in
generalized linear mixed models. Electronic Journal Statistics Vol. 8,
201-225.
Greven, S. and Kneib T. (2010) On the behaviour of marginal and conditional
AIC in linear mixed models. Biometrika 97(4), 773-789.
Efron , B. (2004) The estimation of prediction error. J. Amer. Statist. Ass.
99(467), 619-632.
}
\seealso{
\code{\link[lme4]{lme4-package}}, \code{\link[lme4]{lmer}},
\code{\link[lme4]{glmer}}
}
\author{
Benjamin Saefken, David Ruegamer
}
\keyword{regression}
 
     |