% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/standardize.models.R
\name{standardize.default}
\alias{standardize.default}
\alias{standardize_models}
\title{Re-fit a model with standardized data}
\usage{
\method{standardize}{default}(
  x,
  robust = FALSE,
  two_sd = FALSE,
  weights = TRUE,
  verbose = TRUE,
  include_response = TRUE,
  ...
)
}
\arguments{
\item{x}{A statistical model.}

\item{robust}{Logical. If \code{TRUE}, variables are standardized by
subtracting the median and dividing by the median absolute deviation
(MAD). If \code{FALSE}, variables are standardized by subtracting the
mean and dividing by the standard deviation (SD).}

\item{two_sd}{If \code{TRUE}, the variables are scaled by two times the deviation
(SD or MAD depending on \code{robust}). This method can be useful to obtain
model coefficients of continuous parameters comparable to coefficients
related to binary predictors, when applied to \strong{the predictors} (not the
outcome) (Gelman, 2008).}

\item{weights}{If \code{TRUE} (default), a weighted standardization is carried out.}

\item{verbose}{Toggle warnings and messages on or off.}

\item{include_response}{If \code{TRUE} (default), the response value will also be
standardized. If \code{FALSE}, only the predictors will be standardized.
\itemize{
\item Note that for GLMs and models with non-linear link functions, the
response value will not be standardized, to make re-fitting the model work.
\item If the model contains an \code{\link[stats:offset]{stats::offset()}}, the offset variable(s) will
be standardized only if the response is standardized. If \code{two_sd = TRUE},
offsets are standardized by one-sd (similar to the response).
\item (For \code{mediate} models, \code{include_response} refers to the outcome in
the y model; the m model's response will always be standardized when possible).
}}

\item{...}{Arguments passed to or from other methods.}
}
\value{
A statistical model fitted on standardized data.
}
\description{
Performs a standardization of data (z-scoring) using
\code{\link[=standardize]{standardize()}} and then re-fits the model to the standardized data.
\cr\cr
Standardization is done by completely refitting the model on the standardized
data. Hence, this approach is equal to standardizing the variables \emph{before}
fitting the model and will return a new model object. This method is
particularly recommended for complex models that include interactions or
transformations (e.g., polynomial or spline terms). The \code{robust} argument
(defaults to \code{FALSE}) enables a robust standardization of the data, based
on the \code{median} and the \code{MAD} instead of the \code{mean} and the \code{SD}.
}
\section{Generalized Linear Models}{
Standardization for generalized linear models (GLM, GLMM, etc.) is done only
with respect to the predictors, while the outcome remains as-is
(unstandardized). This maintains the interpretability of the coefficients
(e.g., in a binomial model, the exponentiated standardized parameter is the
odds ratio (OR) for a change of 1 SD in the predictor).
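
A minimal sketch of this behaviour, assuming a logistic regression on the
built-in \code{mtcars} data (the model and the object names \code{m_glm} and
\code{m_glm_z} are illustrative only, not part of the package):

\preformatted{library(datawizard)

# Illustrative binomial model; the 0/1 outcome `vs` is left as-is
m_glm <- glm(vs ~ mpg, data = mtcars, family = binomial())

# Re-fit the model with the predictor standardized
m_glm_z <- standardize(m_glm)

# Exponentiated slope: odds ratio for a 1-SD increase in `mpg`
exp(coef(m_glm_z))
}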
}

\section{Dealing with Factors}{
\code{standardize(model)} or \code{standardize_parameters(model, method = "refit")} do
\emph{not} standardize categorical predictors (i.e. factors) / their
dummy-variables, which may be a different behaviour compared to other R
packages (such as \strong{lm.beta}) or other software packages (like SPSS). To
mimic such behaviours, either use \code{standardize_parameters(model, method = "basic")} to obtain post-hoc standardized parameters, or standardize the data
with \code{standardize(data, force = TRUE)} \emph{before} fitting the
model.
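
The difference can be sketched as follows, assuming an illustrative model on
the built-in \code{mtcars} data in which \code{am} is recoded as a binary
factor (the object names \code{d}, \code{d_z} and \code{m_fac} are
hypothetical):

\preformatted{library(datawizard)

# Illustrative data with a binary factor predictor
d <- mtcars
d$am <- factor(d$am)
m_fac <- lm(mpg ~ am + wt, data = d)

# Default re-fit: numeric variables are standardized, the factor is not
coef(standardize(m_fac))

# Standardizing the data first with force = TRUE converts the factor to
# numeric values and standardizes it along with the other variables
d_z <- standardize(d, force = TRUE)
coef(lm(mpg ~ am + wt, data = d_z))
}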
}

\section{Transformed Variables}{
When the model's formula contains transformations (e.g., \code{y ~ exp(X)}), the
transformation effectively takes place after standardization (e.g.,
\code{exp(scale(X))}). Since some transformations are undefined for
non-positive (\code{log()}) or negative (\code{sqrt()}) values, the relevant
variables are shifted (post standardization) by \code{Z - min(Z) + 1} or
\code{Z - min(Z)} (respectively).
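
The arithmetic of these shifts can be sketched in base R. This only
illustrates the formulas above on a made-up vector \code{x}; it is not the
package's internal code:

\preformatted{x <- c(2, 4, 6, 8, 10)          # made-up example values
z <- as.numeric(scale(x))       # standardized values, some of them negative

z_log  <- z - min(z) + 1        # shift applied before log(): minimum becomes 1
z_sqrt <- z - min(z)            # shift applied before sqrt(): minimum becomes 0

log(z_log)                      # defined for all values
sqrt(z_sqrt)                    # defined for all values
}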
}

\examples{
model <- lm(Infant.Mortality ~ Education * Fertility, data = swiss)
coef(standardize(model))
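
# The sketches below are illustrative additions that reuse `model` from above;
# the object names `swiss_z` and `model_z` are hypothetical.

# Re-fitting is described as equivalent to standardizing the data before
# fitting, so the two sets of coefficients are expected to agree up to
# numerical tolerance.
swiss_z <- standardize(swiss)
model_z <- lm(Infant.Mortality ~ Education * Fertility, data = swiss_z)
all.equal(coef(standardize(model)), coef(model_z))

# Robust standardization (median and MAD instead of mean and SD)
coef(standardize(model, robust = TRUE))

# Scale by two times the deviation (Gelman, 2008)
coef(standardize(model, two_sd = TRUE))

# Standardize only the predictors, leaving the response on its original scale
coef(standardize(model, include_response = FALSE))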

}
\seealso{
Other standardize: 
\code{\link{standardize}()}
}
\concept{standardize}