% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/tableby.R
\name{tableby}
\alias{tableby}
\title{Summary Statistics of a Set of Independent Variables by a Categorical Variable}
\usage{
tableby(
  formula,
  data,
  na.action,
  subset = NULL,
  weights = NULL,
  strata,
  control = NULL,
  ...
)
}
\arguments{
\item{formula}{an object of class \code{\link{formula}}; a symbolic description of the variables to be summarized by the group,
or categorical variable, of interest. See "Details" for more information. To only view overall summary
statistics, a one-sided formula can be used.}

\item{data}{an optional data frame, list or environment (or object coercible by \code{\link{as.data.frame}} to a data frame)
containing the variables in the model. If not found in data, the variables are taken from \code{environment(formula)},
typically the environment from which the function is called.}

\item{na.action}{a function which indicates what should happen when the data contain \code{NA}s.
The default is \code{na.tableby(TRUE)} if there is a by-variable, and \code{na.tableby(FALSE)} if there is not.
This schema thus includes observations with \code{NA}s in x variables,
but removes those with \code{NA} in the categorical group variable and strata (if used).}

\item{subset}{an optional vector specifying a subset of observations (rows of data) to be used in the results.
It can be a logical vector or a vector of indices.}

\item{weights}{a vector of weights. Using weights will disable statistical tests.}

\item{strata}{a vector of strata to separate summaries by an additional group.}

\item{control}{control parameters to handle optional settings within \code{tableby}.
Two aspects of \code{tableby} are controlled with these: test options of RHS variables across levels of the categorical
grouping variable, and x variable summaries within the grouping variable. Arguments for \code{tableby.control}
can be passed to \code{tableby} via the \code{...} argument, but if a control object and \code{...} arguments are both supplied,
the latter are used. See \code{\link{tableby.control}} for more details.}

\item{...}{additional arguments to be passed to internal \code{tableby} functions or \code{\link{tableby.control}}.}
}
\value{
An object with class \code{c("tableby", "arsenal_table")}
}
\description{
Summarize one or more variables (x) by a categorical variable (y). Variables
  on the right side of the formula, i.e. independent variables, are summarized by the
  levels of a categorical variable on the left of the formula. Optionally, an appropriate test is performed to test the
  distribution of the independent variables across the levels of the categorical variable.
}
\details{
The group variable (if any) is categorical, which could be an integer, character,
factor, or ordered factor. \code{tableby} makes a simple summary of
the counts within the k-levels of the independent variables on the
right side of the formula. Note that unused levels are dropped.

The \code{data} argument allows data.frames with label attributes for the columns, and those
labels will be used in the summary methods for the \code{tableby} class.

The independent variables are a mixture of types: categorical (discrete),
numeric (continuous), and time to event (survival). These variables
are split by the levels of the group variable (if any), then summarized within
those levels, specific to the variable type. A statistical test is
performed to compare the distribution of the independent variables across the
levels of the grouping variable.

The tests differ by the independent variable type, but can be specified
explicitly in the formula statement or in the control function.
These tests are accepted:
\itemize{
  \item{
    \code{anova}: analysis of variance test; the default test for continuous variables. When
    the LHS variable has two levels, this is equivalent to a two-sample t-test.
  }
  \item{
    \code{kwt}: Kruskal-Wallis rank test; an optional test for continuous
    variables. When the LHS variable has two levels, this is equivalent to the Wilcoxon test.
  }
  \item{
    \code{wt}: An explicit Wilcoxon test.
  }
  \item{
    \code{medtest}: A median test.
  }
  \item{
    \code{chisq}: chi-square goodness of fit test for equal counts of a
    categorical variable across categories; the default for categorical
    or factor variables.
  }
  \item{
    \code{fe}: Fisher's exact test for categorical variables.
  }
  \item{
    \code{trend}: trend test for equal distribution of an ordered variable
    across a categorical variable; the default for ordered factor variables.
  }
  \item{
    \code{logrank}: log-rank test; the default for time-to-event variables.
  }
  \item{
    \code{notest}: no test is performed.
  }
}

To perform a mixture of asymptotic and rank-based tests on two
different continuous variables, an example formula is:
\code{formula = group ~ anova(age) + kwt(height)}. The test settings
in \code{tableby.control} apply to all independent variables of a given type.

The summary statistics reported for each independent variable within the
group variable can be set in \code{\link{tableby.control}}.

Finally, multiple by-variables can be set using \code{list()}. See the examples for more details.
}
\examples{
data(mockstudy)
tab1 <- tableby(arm ~ sex + age, data=mockstudy)
summary(tab1, text=TRUE)

mylabels <- list(sex = "SEX", age ="Age, yrs")
summary(tab1, labelTranslations = mylabels, text=TRUE)
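
# Labels can also be stored on the data itself. A minimal sketch, assuming a
# label attribute set on a column is picked up by summary(); mock2 and tab2
# are illustrative names, and the data are copied so mockstudy is unchanged.
mock2 <- mockstudy
attr(mock2$sex, "label") <- "Sex (from label attribute)"
tab2 <- tableby(arm ~ sex + age, data=mock2)
summary(tab2, text=TRUE)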

tab3 <- tableby(arm ~ sex + age, data=mockstudy, test=FALSE, total=FALSE,
                numeric.stats=c("median","q1q3"), numeric.test="kwt")
summary(tab3, text=TRUE)
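
# A sketch of the same settings as tab3, supplied through a control object
# rather than through the dots argument; ctrl and tab3b are illustrative names.
ctrl <- tableby.control(test=FALSE, total=FALSE,
                        numeric.stats=c("median","q1q3"), numeric.test="kwt")
tab3b <- tableby(arm ~ sex + age, data=mockstudy, control=ctrl)
summary(tab3b, text=TRUE)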

# multiple LHS
summary(tableby(list(arm, sex) ~ age, data = mockstudy, strata = ps), text = TRUE)

tab.test <- tableby(arm ~ kwt(age) + anova(bmi) + kwt(ast), data=mockstudy)
tests(tab.test)
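
# One-sided formula: overall summary statistics only, with no by-variable.
# A minimal sketch; tab.overall is an illustrative name.
tab.overall <- tableby(~ sex + age, data=mockstudy)
summary(tab.overall, text=TRUE)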

}
\seealso{
\code{\link{arsenal_table}}, \code{\link[stats]{anova}}, \code{\link[stats]{chisq.test}}, \code{\link{tableby.control}},
  \code{\link{summary.tableby}}, \code{\link{tableby.internal}}, \code{\link{formulize}}, \code{\link{selectall}}
}
\author{
Jason Sinnwell, Beth Atkinson, Gregory Dougherty, and Ethan Heinzen, adapted from SAS Macros written by Paul Novotny and Ryan Lennon
}