% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/rescale_weights.R
\name{rescale_weights}
\alias{rescale_weights}
\title{Rescale design weights for multilevel analysis}
\usage{
rescale_weights(data, group, probability_weights, nest = FALSE)
}
\arguments{
\item{data}{A data frame.}

\item{group}{Variable names (as character vector, or as formula), indicating
the grouping structure (strata) of the survey data (level-2-cluster
variable). It is also possible to create weights for multiple group
variables; in such cases, each created weighting variable will be suffixed
by the name of the group variable.}

\item{probability_weights}{Variable indicating the probability (design or
sampling) weights of the survey data (level-1-weight).}

\item{nest}{Logical, if \code{TRUE} and \code{group} indicates at least two
group variables, then groups are treated as "nested", i.e. each group is
defined by a combination of the levels of the variables in \code{group}.}
}
\value{
\code{data}, including the new weighting variables: \code{pweights_a}
and \code{pweights_b}, which represent the rescaled design weights to use
in multilevel models (use these variables for the \code{weights} argument).
}
\description{
Most functions that fit multilevel and mixed effects models only allow
frequency weights to be specified, but not design (i.e. sampling or
probability) weights, which should be used when analyzing complex samples
and survey data. \code{rescale_weights()} implements an algorithm proposed
by \cite{Asparouhov (2006)} and \cite{Carle (2009)} that rescales the design
weights in survey data to account for the grouping structure of multilevel
models. The rescaled weights can then be used for multilevel modelling.
}
\details{
Rescaling is based on two methods: For \code{pweights_a}, the sample weights
\code{probability_weights} are multiplied by a factor equal to the group
size divided by the sum of the sampling weights within each group. The
adjustment factor for \code{pweights_b} is the sum of the sample weights within
each group divided by the sum of the squared sample weights within each group
(see Carle (2009), Appendix B). In other words, \code{pweights_a} "scales the
weights so that the new weights sum to the cluster sample size" while
\code{pweights_b} "scales the weights so that the new weights sum to the
effective cluster size".
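
As a sketch of the two adjustments in formula form, assume the following
notation, which is introduced here for illustration and is not taken from the
cited papers: \eqn{w_{ij}} is the design weight of observation \eqn{i} in
group \eqn{j}, and \eqn{n_j} is the number of observations in group \eqn{j}.
The rescaled weights for methods A and B would then read:

\deqn{w^{A}_{ij} = w_{ij} \frac{n_j}{\sum_{i} w_{ij}}, \qquad w^{B}_{ij} = w_{ij} \frac{\sum_{i} w_{ij}}{\sum_{i} w_{ij}^{2}}}{w_A[ij] = w[ij] * n[j] / sum_i(w[ij]),   w_B[ij] = w[ij] * sum_i(w[ij]) / sum_i(w[ij]^2)}

Summing within group \eqn{j} gives \eqn{n_j} for method A and
\eqn{(\sum_{i} w_{ij})^{2} / \sum_{i} w_{ij}^{2}}{sum_i(w[ij])^2 / sum_i(w[ij]^2)}
for method B, which matches the "cluster sample size" and "effective cluster
size" wording quoted above.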

Regarding the choice between scaling methods A and B, Carle suggests that
"analysts who wish to discuss point estimates should report results based on
weighting method A. For analysts more interested in residual between-group
variance, method B may generally provide the least biased estimates". In
general, it is recommended to fit an unweighted model as well as weighted
models with both scaling methods, and then to compare the models to see
whether the "inferential decisions converge", to gain confidence in the
results.

Though the bias of scaled weights decreases with increasing group size,
method A is preferred when insufficient or low group size is a concern.

The group ID and possibly the PSU may be used as random effects (e.g. as a
nested design, or with group and PSU as varying intercepts), depending on the
survey design that should be mimicked.
}
\examples{
if (require("lme4")) {
  data(nhanes_sample)
  head(rescale_weights(nhanes_sample, "SDMVSTRA", "WTINT2YR"))

  # also works with multiple group-variables
  head(rescale_weights(nhanes_sample, c("SDMVSTRA", "SDMVPSU"), "WTINT2YR"))

  # or nested structures.
  x <- rescale_weights(
    data = nhanes_sample,
    group = c("SDMVSTRA", "SDMVPSU"),
    probability_weights = "WTINT2YR",
    nest = TRUE
  )
  head(x)

  nhanes_sample <- rescale_weights(nhanes_sample, "SDMVSTRA", "WTINT2YR")

  glmer(
    total ~ factor(RIAGENDR) * (log(age) + factor(RIDRETH1)) + (1 | SDMVPSU),
    family = poisson(),
    data = nhanes_sample,
    weights = pweights_a
  )
}
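
# A minimal sketch, not part of the original examples, that checks the two
# scaling rules described in Details by hand. The data frame `toy` and its
# columns `cluster` and `w` are made up for illustration only.
toy <- data.frame(
  cluster = c(1, 1, 1, 2, 2),
  w = c(1, 2, 3, 4, 4)
)
toy <- rescale_weights(toy, "cluster", "w")

# method A: per-cluster sums of the rescaled weights should match the
# cluster sample sizes
tapply(toy$pweights_a, toy$cluster, sum)
table(toy$cluster)

# method B: per-cluster sums of the rescaled weights should match the
# effective cluster sizes, computed here as sum(w)^2 / sum(w^2)
tapply(toy$pweights_b, toy$cluster, sum)
tapply(toy$w, toy$cluster, function(w) sum(w)^2 / sum(w^2))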
}
\references{
\itemize{
\item Carle A.C. (2009). Fitting multilevel models in complex survey data
with design weights: Recommendations. BMC Medical Research Methodology
9(49): 1-13
\item Asparouhov T. (2006). General Multi-Level Modeling with Sampling
Weights. Communications in Statistics - Theory and Methods 35: 439-460
}
}