% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/brglmFit.R, R/zzz_conventions.R
\name{brglmFit}
\alias{brglmFit}
\alias{brglm_fit}
\title{Fitting function for \code{\link[=glm]{glm()}} for reduced-bias estimation and
inference}
\usage{
brglmFit(
  x,
  y,
  weights = rep(1, nobs),
  start = NULL,
  etastart = NULL,
  mustart = NULL,
  offset = rep(0, nobs),
  family = gaussian(),
  control = list(),
  intercept = TRUE,
  fixed_totals = NULL,
  singular.ok = TRUE
)

brglm_fit(
  x,
  y,
  weights = rep(1, nobs),
  start = NULL,
  etastart = NULL,
  mustart = NULL,
  offset = rep(0, nobs),
  family = gaussian(),
  control = list(),
  intercept = TRUE,
  fixed_totals = NULL,
  singular.ok = TRUE
)
}
\arguments{
\item{x}{a design matrix of dimension \code{n * p}.}

\item{y}{a vector of observations of length \code{n}.}

\item{weights}{an optional vector of \sQuote{prior weights} to be used
    in the fitting process.  Should be \code{NULL} or a numeric vector.}

\item{start}{starting values for the parameters in the linear
predictor. If \code{NULL} (default) then the maximum likelihood
estimates are calculated and used as starting values.}

\item{etastart}{applied only when \code{start} is \code{NULL}. Starting
values for the linear predictor to be passed to
\code{\link[=glm.fit]{glm.fit()}} when computing starting values using maximum
likelihood.}

\item{mustart}{applied only when \code{start} is \code{NULL}. Starting
values for the vector of means to be passed to
\code{\link[=glm.fit]{glm.fit()}} when computing starting values using maximum
likelihood.}

\item{offset}{this can be used to specify an \emph{a priori} known
    component to be included in the linear predictor during fitting.
    This should be \code{NULL} or a numeric vector of length equal to
    the number of cases.  One or more \code{\link[stats]{offset}} terms can be
    included in the formula instead or as well, and if more than one is
    specified their sum is used.  See \code{\link[stats]{model.offset}}.}

\item{family}{a description of the error distribution and link
    function to be used in the model.  For \code{glm} this can be a
    character string naming a family function, a family function or the
    result of a call to a family function.  For \code{glm.fit} only the
    third option is supported.  (See \code{\link[stats]{family}} for details of
    family functions.)}

\item{control}{a list of parameters controlling the fitting
process. See \code{\link[=brglmControl]{brglmControl()}} for details.}

\item{intercept}{logical. Should an intercept be included in the
    \emph{null} model?}

\item{fixed_totals}{effective only when \code{family} is
\code{\link[=poisson]{poisson()}}. Either \code{NULL} (no effect) or a vector that
indicates which counts must be treated as a group. See Details and
\code{\link[=brmultinom]{brmultinom()}} for more information.}

\item{singular.ok}{logical. If \code{FALSE}, a singular model is an
error.}
}
\description{
\code{\link[=brglmFit]{brglmFit()}} is a fitting method for \code{\link[=glm]{glm()}} that fits generalized
linear models using implicit and explicit bias reduction methods
(Kosmidis, 2014), and other penalized maximum likelihood
methods. Currently supported methods include the mean bias-reducing
adjusted scores approach in Firth (1993) and Kosmidis & Firth
(2009), the median bias-reduction adjusted scores approach in Kenne
Pagui et al. (2017), the correction of the asymptotic bias in
Cordeiro & McCullagh (1991), the mixed bias-reduction adjusted
scores approach in Kosmidis et al. (2020), maximum penalized
likelihood with powers of the Jeffreys prior as penalty, and
maximum likelihood. Estimation is performed using a quasi Fisher
scoring iteration (see \code{vignette("iteration", "brglm2")}), which, in
the case of mean bias reduction, resembles an iterative correction
of the asymptotic bias of the Fisher scoring iterates.
}
\details{
A detailed description of the supported adjustments and the quasi
Fisher scoring iteration is given in the iteration vignette (see
\code{vignette("iteration", "brglm2")} or Kosmidis et al., 2020).  A
shorter description of the quasi Fisher scoring iteration is also
given in one of the vignettes of the \emph{enrichwith} R package (see
\url{https://cran.r-project.org/package=enrichwith/vignettes/bias.html}).
Kosmidis and Firth (2010) describe a parallel quasi Newton-Raphson
iteration with the same stationary point.
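
As a rough sketch of that iteration for the regression parameters
only (any dispersion parameter is ignored here), write
\eqn{s(\beta)}{s(beta)} for the score, \eqn{i(\beta)}{i(beta)} for
the expected information and \eqn{A(\beta)}{A(beta)} for the score
adjustment selected by \code{type}. This notation is assumed for
illustration and is not taken from the package code. One quasi Fisher
scoring step then has the form
\deqn{\beta^{(j+1)} = \beta^{(j)} + [i(\beta^{(j)})]^{-1} [s(\beta^{(j)}) + A(\beta^{(j)})]}{beta(j+1) = beta(j) + [i(beta(j))]^(-1) [s(beta(j)) + A(beta(j))]}
For mean bias reduction,
\eqn{A(\beta) = -i(\beta) b(\beta)}{A(beta) = -i(beta) b(beta)}, where
\eqn{b(\beta)}{b(beta)} is the first-order bias of the maximum
likelihood estimator, so each step is a Fisher scoring step minus an
estimate of that bias. The iteration vignette gives the exact
expressions used for each adjustment.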

In the special case of generalized linear models for binomial,
Poisson and multinomial responses, the adjusted score equation
approaches for \code{type = "AS_mixed"}, \code{type = "AS_mean"}, and
\code{type = "AS_median"} (see below for the method each \code{type}
corresponds to) return estimates with improved frequentist
properties that are also always finite, even in cases where the
maximum likelihood estimates are infinite (e.g. complete and
quasi-complete separation in multinomial regression). See Kosmidis
and Firth (2021) for a proof for binomial-response GLMs with
Jeffreys-prior penalties to the log-likelihood, which is equivalent
to mean bias reduction for logistic regression. See also
\code{\link[detectseparation:detect_separation]{detectseparation::detect_separation()}} and
\code{\link[detectseparation:check_infinite_estimates]{detectseparation::check_infinite_estimates()}} for pre-fit and
post-fit methods for the detection of infinite estimates in
binomial response generalized linear models.

The type of score adjustment to be used is specified through the
\code{type} argument (see \code{\link[=brglmControl]{brglmControl()}} for details). The available
options are
\itemize{
\item \code{type = "AS_mixed"}: the mixed bias-reducing score adjustments in
Kosmidis et al. (2020) that result in mean bias reduction for the
regression parameters and median bias reduction for the dispersion
parameter, if any. This is the default.
\item \code{type = "AS_mean"}: the mean bias-reducing score adjustments in
Firth (1993) and Kosmidis & Firth (2009). \code{type = "AS_mixed"} and
\code{type = "AS_mean"} return the same results when \code{family} is
\code{\link[=binomial]{binomial()}} or \code{\link[=poisson]{poisson()}}, i.e. when the dispersion is fixed.
\item \code{type = "AS_median"}: the median bias-reducing score
adjustments in Kenne Pagui et al. (2017).
\item \code{type = "MPL_Jeffreys"}: maximum penalized likelihood
with powers of the Jeffreys prior as penalty.
\item \code{type = "ML"}: maximum likelihood.
\item \code{type = "correction"}: asymptotic bias correction, as in
Cordeiro & McCullagh (1991).
}
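
For \code{type = "MPL_Jeffreys"}, the objective can be sketched as
follows. The notation is assumed here for illustration:
\eqn{\ell(\beta)}{l(beta)} is the log-likelihood,
\eqn{i(\beta)}{i(beta)} is the expected information matrix, and
\eqn{a} is the power of the Jeffreys prior (the \code{a} argument passed
in the examples). The penalized log-likelihood is
\deqn{\ell^*(\beta) = \ell(\beta) + a \log \det i(\beta)}{l*(beta) = l(beta) + a log det i(beta)}
Setting \eqn{a = 0} recovers maximum likelihood, while \eqn{a = 1/2}
corresponds to the Jeffreys prior itself, which for logistic
regression coincides with mean bias reduction (Firth, 1993).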

The null deviance is evaluated based on the fitted values using the
method specified by the \code{type} argument (see \code{\link[=brglmControl]{brglmControl()}}).

The \code{family} argument of the current version of \code{\link[=brglmFit]{brglmFit()}} can
accept any combination of \code{\link[=family]{"family"}} objects and link functions,
including families with user-specified link functions, \code{\link[=mis]{mis()}}
links, and \code{\link[=power]{power()}} links, but excluding \code{\link[=quasi]{quasi()}},
\code{\link[=quasipoisson]{quasipoisson()}} and \code{\link[=quasibinomial]{quasibinomial()}} families.

The description of the \code{method} argument and the \verb{Fitting functions}
section in \code{\link[=glm]{glm()}} give information on supplying fitting
methods to \code{\link[=glm]{glm()}}.

\code{fixed_totals} specifies groups of observations for which the sum
of the means of a Poisson model will be held fixed to the observed
count for each group. This argument is used internally in
\code{\link[=brmultinom]{brmultinom()}} and \code{\link[=bracl]{bracl()}} for baseline-category logit models and
adjacent category logit models, respectively.

\code{\link[=brglm_fit]{brglm_fit()}} is an alias to \code{\link[=brglmFit]{brglmFit()}}.
}
\examples{
## The lizards example from ?brglm::brglm
data("lizards")
# Fit the model using maximum likelihood
lizardsML <- glm(cbind(grahami, opalinus) ~ height + diameter +
                 light + time, family = binomial(logit), data = lizards,
                 method = "glm.fit")
# Mean bias-reduced fit:
lizardsBR_mean <- glm(cbind(grahami, opalinus) ~ height + diameter +
                      light + time, family = binomial(logit), data = lizards,
                      method = "brglmFit")
# Median bias-reduced fit:
lizardsBR_median <- glm(cbind(grahami, opalinus) ~ height + diameter +
                        light + time, family = binomial(logit), data = lizards,
                        method = "brglmFit", type = "AS_median")
summary(lizardsML)
summary(lizardsBR_median)
summary(lizardsBR_mean)

# Maximum penalized likelihood with Jeffreys prior penalty
lizards_Jeffreys <- glm(cbind(grahami, opalinus) ~ height + diameter +
                        light + time, family = binomial(logit), data = lizards,
                        method = "brglmFit", type = "MPL_Jeffreys")
# lizards_Jeffreys is the same fit as lizardsBR_mean (see Firth, 1993)
all.equal(coef(lizardsBR_mean), coef(lizards_Jeffreys))

# Maximum penalized likelihood with powers of the Jeffreys prior as
# penalty. See Kosmidis & Firth (2021) for the finiteness and
# shrinkage properties of the maximum penalized likelihood
# estimators in binomial response models
\donttest{
a <- seq(0, 20, 0.5)
coefs <- sapply(a, function(a) {
      out <- glm(cbind(grahami, opalinus) ~ height + diameter +
             light + time, family = binomial(logit), data = lizards,
             method = "brglmFit", type = "MPL_Jeffreys", a = a)
      coef(out)
})
# Illustration of shrinkage as a grows
matplot(a, t(coefs), type = "l", col = 1, lty = 1)
abline(0, 0, col = "grey")
}

\donttest{
## Another example from
## King, Gary, James E. Alt, Nancy Elizabeth Burns and Michael Laver
## (1990).  "A Unified Model of Cabinet Dissolution in Parliamentary
## Democracies", _American Journal of Political Science_, **34**, 846-870

data("coalition", package = "brglm2")
# The maximum likelihood fit (family = Gamma uses its default inverse link)
coalitionML <- glm(duration ~ fract + numst2, family = Gamma, data = coalition)
# The mean bias-reduced fit
coalitionBR_mean <- update(coalitionML, method = "brglmFit")
# The bias-corrected fit
coalitionBC <- update(coalitionML, method = "brglmFit", type = "correction")
# The median bias-reduced fit
coalitionBR_median <- update(coalitionML, method = "brglmFit", type = "AS_median")
}

\donttest{
## An example with offsets from Venables & Ripley (2002, p.189)
data("anorexia", package = "MASS")

anorexML <- glm(Postwt ~ Prewt + Treat + offset(Prewt),
                family = gaussian, data = anorexia)
anorexBC <- update(anorexML, method = "brglmFit", type = "correction")
anorexBR_mean <- update(anorexML, method = "brglmFit")
anorexBR_median <- update(anorexML, method = "brglmFit", type = "AS_median")

# All methods return the same estimates for the regression
# parameters because the maximum likelihood estimator is normally
# distributed around the `true` value under the model (hence, both
# mean and component-wise median unbiased). The Wald tests for
# anorexBC and anorexBR_mean differ from those for anorexML because
# the bias-reduced estimator of the dispersion is the unbiased
# estimator of the residual variance, obtained by a degrees of
# freedom adjustment (divide by n - p). The Wald tests from
# anorexBR_median are based on the median bias-reduced estimator of
# the dispersion, which results from a different adjustment of the
# degrees of freedom (divide by n - p - 2/3)
summary(anorexML)
summary(anorexBC)
summary(anorexBR_mean)
summary(anorexBR_median)
}

## endometrial data from Heinze & Schemper (2002) (see ?endometrial)
data("endometrial", package = "brglm2")
endometrialML <- glm(HG ~ NV + PI + EH, data = endometrial,
                     family = binomial("probit"))
endometrialBR_mean <- update(endometrialML, method = "brglmFit",
                             type = "AS_mean")
endometrialBC <- update(endometrialML, method = "brglmFit",
                        type = "correction")
endometrialBR_median <- update(endometrialML, method = "brglmFit",
                               type = "AS_median")
summary(endometrialML)
summary(endometrialBC)
summary(endometrialBR_mean)
summary(endometrialBR_median)
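
## A minimal sketch (not part of the original examples) of calling
## brglmFit() directly with a design matrix and a response vector,
## using the arguments listed in the usage section. The object names
## endometrial_mf, X, y_HG and endometrial_direct are illustrative
## only, and the call assumes that `control` accepts a list with a
## `type` element, as described for brglmControl().
library("brglm2")
data("endometrial", package = "brglm2")
endometrial_mf <- model.frame(HG ~ NV + PI + EH, data = endometrial)
X <- model.matrix(attr(endometrial_mf, "terms"), endometrial_mf)
y_HG <- model.response(endometrial_mf)
endometrial_direct <- brglmFit(x = X, y = y_HG,
                               family = binomial("probit"),
                               control = list(type = "AS_mean"))
# These estimates are expected to agree with coef(endometrialBR_mean)
endometrial_direct$coefficients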

}
\references{
Kosmidis I, Firth D (2021). Jeffreys-prior penalty, finiteness
and shrinkage in binomial-response generalized linear
models. \emph{Biometrika}, \strong{108}, 71-82. \doi{10.1093/biomet/asaa052}.

Kosmidis I, Kenne Pagui E C, Sartori N (2020). Mean and median bias
reduction in generalized linear models. \emph{Statistics and Computing},
\strong{30}, 43-59. \doi{10.1007/s11222-019-09860-6}.

Cordeiro G M, McCullagh P (1991). Bias correction in generalized
linear models. \emph{Journal of the Royal Statistical Society. Series B
(Methodological)}, \strong{53}, 629-643. \doi{10.1111/j.2517-6161.1991.tb01852.x}.

Firth D (1993). Bias reduction of maximum likelihood estimates.
\emph{Biometrika}, \strong{80}, 27-38. \doi{10.2307/2336755}.

Kenne Pagui E C, Salvan A, Sartori N (2017). Median bias
reduction of maximum likelihood estimates. \emph{Biometrika}, \strong{104},
923-938. \doi{10.1093/biomet/asx046}.

Kosmidis I, Firth D (2009). Bias reduction in exponential family
nonlinear models. \emph{Biometrika}, \strong{96}, 793-804. \doi{10.1093/biomet/asp055}.

Kosmidis I, Firth D (2010). A generic algorithm for reducing
bias in parametric estimation. \emph{Electronic Journal of Statistics},
\strong{4}, 1097-1112. \doi{10.1214/10-EJS579}.

Kosmidis I (2014). Bias in parametric estimation: reduction and
useful side-effects. \emph{WIREs Computational Statistics}, \strong{6},
185-196. \doi{10.1002/wics.1296}.
}
\seealso{
\code{\link[=brglmControl]{brglmControl()}}, \code{\link[=glm.fit]{glm.fit()}}, \code{\link[=glm]{glm()}}
}
\author{
Ioannis Kosmidis \verb{[aut, cre]} \email{ioannis.kosmidis@warwick.ac.uk}, Euloge Clovis Kenne Pagui \verb{[ctb]} \email{kenne@stat.unipd.it}
}