% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/ltable.R
\name{ltable}
\alias{ltable}
\alias{expr.by.cj}
\title{Tabulate Counts and Other Functions by Multiple Variables into a
Long-Format Table}
\usage{
ltable(
  data,
  by.vars = NULL,
  expr = list(obs = .N),
  subset = NULL,
  use.levels = TRUE,
  na.rm = FALSE,
  robust = TRUE
)

expr.by.cj(
  data,
  by.vars = NULL,
  expr = list(obs = .N),
  subset = NULL,
  use.levels = FALSE,
  na.rm = FALSE,
  robust = FALSE,
  .SDcols = NULL,
  enclos = parent.frame(1L),
  ...
)
}
\arguments{
\item{data}{a \code{data.table}/\code{data.frame}}

\item{by.vars}{names of variables that are used for categorization,
as a character vector, e.g. \code{c('sex','agegroup')}}

\item{expr}{an expression or a list of expressions, each of which is a
function of one or more variables in \code{data} (see Details)}

\item{subset}{a logical condition; \code{data} is limited to the rows
satisfying it before \code{expr} is evaluated. Levels that do not occur in
the subset are still included in the output, with the result of \code{expr}
returned as \code{NA}. See Examples.}

\item{use.levels}{logical; if \code{TRUE}, uses the factor levels of the given
variables where present. Use this to obtain e.g. counts for levels
that are defined in a factor variable but have zero observations.}

\item{na.rm}{logical; if \code{TRUE}, drops rows of the table that have
\code{NA} in any of the \code{by.vars} columns}

\item{robust}{logical; if \code{TRUE}, runs the output data's
\code{by.vars} columns through \code{robust_values} before outputting}

\item{.SDcols}{advanced; a character vector of column names
passed on inside the \code{data.table} brackets
\code{DT[, , ...]}; see \code{\link[data.table]{data.table}}; if \code{NULL},
uses all appropriate columns. See Examples for usage.}

\item{enclos}{advanced; an environment; the enclosing
environment of the data.}

\item{...}{advanced; other arguments passed on inside the
\code{data.table} brackets \code{DT[, , ...]}; see
\code{\link[data.table]{data.table}}}
}
\value{
A \code{data.table} of statistics (e.g. counts) stratified by the columns defined
in \code{by.vars}.
}
\description{
\code{ltable} makes use of \code{data.table}
capabilities to tabulate frequencies or
arbitrary functions of given variables into a long format
\code{data.table}/\code{data.frame}. \code{expr.by.cj} is the
equivalent for more advanced users.
}
\details{
Evaluates \code{expr} for each unique combination of the values of the given
\code{by.vars}.

By default, all \code{\link{levels}} defined for each factor variable in
\code{by.vars} are used. This is useful because even if a subset of the
data contains no observations for e.g. a specific age group, that age
group is nevertheless present in the resulting table; e.g. with the default
\code{expr = list(obs = .N)} every age group level
is represented by a row and can have \code{obs = 0}.

The function differs from the
vanilla \code{\link{table}} by returning a long-format table of values
regardless of the number of \code{by.vars} given.
Use e.g. \code{\link{cast_simple}} if the data needs to be
presented in a wide format (e.g. a two-way table).

The rows of the long-format table are effectively the Cartesian product
of the levels of each variable in \code{by.vars},
e.g. with \code{by.vars = c("sex", "area")} all levels of
\code{area} are repeated for each level of \code{sex}
in the table.

The \code{expr} argument allows the user to evaluate arbitrary expressions
within all levels defined by \code{by.vars}. Here are some examples:
\itemize{
\item \code{.N} or \code{list(.N)}: \code{.N} is used inside a
\code{data.table} to calculate counts in each group
\item \code{list(obs = .N)}: same as above, but with a user-assigned
variable name
\item \code{list(sum(obs), sum(pyrs), mean(dg_age))}: multiple expressions
in a list
\item \code{list(obs = sum(obs), pyrs = sum(pyrs))}: same as above, with
user-defined variable names
}

If \code{use.levels = FALSE}, no \code{levels} information is
used. This means that if e.g. the \code{agegroup}
variable is a factor with 18 levels defined, but only 15 levels
are present in the data, no rows for the missing
levels are shown in the table.

\code{na.rm} simply drops any rows from the resulting table where
any of the \code{by.vars} values is \code{NA}.
}
\section{Functions}{
\itemize{
\item \code{expr.by.cj()}: A somewhat more streamlined \code{ltable} with
defaults chosen for speed. The enclosing environment of the data
is determined explicitly via \code{enclos}.

}}
\examples{
data("sire", package = "popEpi")
sr <- sire
sr$agegroup <- cut(sr$dg_age, breaks=c(0,45,60,75,85,Inf))
## counts by default
ltable(sr, "agegroup")

## any expression can be given
ltable(sr, "agegroup", list(mage = mean(dg_age)))
ltable(sr, "agegroup", list(mage = mean(dg_age), vage = var(dg_age)))

## also returns levels where there are zero rows (expressions as NA)
ltable(sr, "agegroup", list(obs = .N,
                            minage = min(dg_age),
                            maxage = max(dg_age)),
       subset = dg_age < 85)
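
## A hedged sketch (an illustration, not from the package authors):
## tabulating by two variables, so that rows are combinations of the levels
## of both. The column 'dead' is a hypothetical helper derived from 'status',
## assuming that status 0 means alive; it is not part of the sire data.
data("sire", package = "popEpi")
sr2 <- data.table::copy(sire)
sr2$agegroup <- cut(sr2$dg_age, breaks = c(0, 45, 60, 75, 85, Inf))
sr2$dead <- factor(sr2$status != 0, levels = c(FALSE, TRUE),
                   labels = c("alive", "dead"))
lt2 <- ltable(sr2, c("dead", "agegroup"))
lt2

## with use.levels = FALSE the factor levels are not used to define rows;
## compare the rows returned here with those of the default above
lt3 <- ltable(sr2, c("dead", "agegroup"), use.levels = FALSE,
              subset = dg_age < 85)
lt3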

#### expr.by.cj
expr.by.cj(sr, "agegroup")

## any arbitrary expression can be given
expr.by.cj(sr, "agegroup", list(mage = mean(dg_age)))
expr.by.cj(sr, "agegroup", list(mage = mean(dg_age), vage = var(dg_age)))

## only uses levels of by.vars present in data
expr.by.cj(sr, "agegroup", list(mage = mean(dg_age), vage = var(dg_age)),
           subset = dg_age < 70)

## .SDcols trick
expr.by.cj(sr, "agegroup", lapply(.SD, mean),
           subset = dg_age < 70, .SDcols = c("dg_age", "status"))
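
#### na.rm
## A hedged sketch (an illustration, not from the package authors):
## a few agegroup values are set to NA purely for demonstration, and the
## table is computed with and without na.rm to compare the returned rows.
## The names sr_na, ag, lt_na and lt_narm are introduced only for this sketch.
data("sire", package = "popEpi")
sr_na <- data.table::copy(sire)
ag <- cut(sr_na$dg_age, breaks = c(0, 45, 60, 75, 85, Inf))
ag[1:10] <- NA
sr_na$agegroup <- ag
lt_na <- ltable(sr_na, "agegroup", na.rm = FALSE)
lt_na
## na.rm = TRUE drops table rows where agegroup is NA
lt_narm <- ltable(sr_na, "agegroup", na.rm = TRUE)
lt_narm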
}
\seealso{
\code{\link{table}}, \code{\link{cast_simple}}, \code{\link[data.table]{melt}}
}
\author{
Joonas Miettinen, Matti Rantanen
}