% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/standardize.models.R
\name{standardize.default}
\alias{standardize.default}
\alias{standardize_models}
\title{Re-fit a model with standardized data}
\usage{
\method{standardize}{default}(
x,
robust = FALSE,
two_sd = FALSE,
weights = TRUE,
verbose = TRUE,
include_response = TRUE,
...
)
}
\arguments{
\item{x}{A statistical model.}
\item{robust}{Logical. If \code{TRUE}, variables are standardized by
subtracting the median and dividing by the median absolute deviation
(MAD). If \code{FALSE}, variables are standardized by subtracting the
mean and dividing by the standard deviation (SD).}
\item{two_sd}{If \code{TRUE}, the variables are scaled by two times the deviation
(SD or MAD depending on \code{robust}). This method can be useful to obtain
model coefficients of continuous parameters comparable to coefficients
related to binary predictors, when applied to \strong{the predictors} (not the
outcome) (Gelman, 2008).}
\item{weights}{If \code{TRUE} (default), a weighted standardization is carried out.}
\item{verbose}{Toggle warnings and messages on or off.}
\item{include_response}{If \code{TRUE} (default), the response value will also be
standardized. If \code{FALSE}, only the predictors will be standardized.
\itemize{
\item Note that for GLMs and models with non-linear link functions, the
response value will not be standardized, to make re-fitting the model work.
\item If the model contains an \code{\link[stats:offset]{stats::offset()}}, the offset variable(s) will
be standardized only if the response is standardized. If \code{two_sd = TRUE},
offsets are standardized by one-sd (similar to the response).
\item For \code{mediate} models, \code{include_response} refers to the outcome in
the y model; the m model's response will always be standardized when possible.
}}
\item{...}{Arguments passed to or from other methods.}
}
\value{
A statistical model fitted on standardized data.
}
\description{
Performs a standardization of data (z-scoring) using
\code{\link[=standardize]{standardize()}} and then re-fits the model to the standardized data.
\cr\cr
Standardization is done by completely refitting the model on the standardized
data. Hence, this approach is equal to standardizing the variables \emph{before}
fitting the model and will return a new model object. This method is
particularly recommended for complex models that include interactions or
transformations (e.g., polynomial or spline terms). The \code{robust} argument
(defaults to \code{FALSE}) enables a robust standardization of data, based on
the \code{median} and the \code{MAD} instead of the \code{mean} and the \code{SD}.
}
\section{Generalized Linear Models}{
Standardization for generalized linear models (GLM, GLMM, etc.) is done only
with respect to the predictors, while the outcome remains as-is
(unstandardized). This maintains the interpretability of the coefficients
(e.g., in a binomial model, the exponent of the standardized parameter is the
odds ratio (OR) for a change of 1 SD in the predictor).
}
\section{Dealing with Factors}{
\code{standardize(model)} and \code{standardize_parameters(model, method = "refit")} do
\emph{not} standardize categorical predictors (i.e., factors) or their
dummy variables, which may differ from the behaviour of other R
packages (such as \strong{lm.beta}) or other software packages (like SPSS). To
mimic such behaviours, either use
\code{standardize_parameters(model, method = "basic")} to obtain post-hoc
standardized parameters, or standardize the data with
\code{standardize(data, force = TRUE)} \emph{before} fitting the model.
}
\section{Transformed Variables}{
When the model's formula contains transformations (e.g., \code{y ~ exp(X)}), the
transformation effectively takes place after standardization (e.g.,
\code{exp(scale(X))}). Since some transformations, such as \code{log()} and
\code{sqrt()}, are undefined for negative (or, for \code{log()}, zero) values,
the relevant variables are shifted (post standardization) by
\code{Z - min(Z) + 1} or \code{Z - min(Z)} (respectively).
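
As a minimal base-R sketch of these shifts (the vector \code{x} is made up for
illustration, and \code{scale()} is assumed here as a stand-in for the
standardization step):
\preformatted{
x <- c(2, 4, 9, 15)          # hypothetical raw values
z <- as.numeric(scale(x))    # z-scores, some of which are negative
log(z - min(z) + 1)          # shifted so the smallest value is 1
sqrt(z - min(z))             # shifted so the smallest value is 0
}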
}
\examples{
model <- lm(Infant.Mortality ~ Education * Fertility, data = swiss)
coef(standardize(model))
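
# The lines below are a hedged sketch of the arguments documented above,
# reusing the same model; they are illustrations only.

# Robust standardization (median and MAD instead of mean and SD)
coef(standardize(model, robust = TRUE))

# Scale by two SDs, following Gelman (2008)
coef(standardize(model, two_sd = TRUE))

# Standardize only the predictors, leaving the response as-is
coef(standardize(model, include_response = FALSE))

# Refitting is described as equal to standardizing the variables before
# fitting. Base scale() is assumed here as a stand-in for standardizing the
# data, so the two sets of coefficients should agree.
swiss_z <- as.data.frame(scale(swiss))
model_z <- lm(Infant.Mortality ~ Education * Fertility, data = swiss_z)
all.equal(coef(model_z), coef(standardize(model)))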
}
\seealso{
Other standardize:
\code{\link{standardize}()}
}
\concept{standardize}