File: splitMulti.Rd

% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/splitMulti.R
\name{splitMulti}
\alias{splitMulti}
\title{Split case-level observations}
\usage{
splitMulti(
  data,
  breaks = NULL,
  ...,
  drop = TRUE,
  merge = TRUE,
  verbose = FALSE
)
}
\arguments{
\item{data}{a Lexis object with event cases as rows}

\item{breaks}{a list of named numeric vectors of breaks; see Details and Examples}

\item{...}{alternative way of supplying breaks as named vectors,
e.g. \code{fot = 0:5} instead of \code{breaks = list(fot = 0:5)}.
If \code{breaks} is not \code{NULL}, \code{breaks} is used and any breaks
passed through \code{...} are ignored. Note also that R partially matches
argument names: if you supply e.g. \code{dat = my_breaks} and do not pass
\code{data} by name (\code{data = my_data}), R interprets this as
\code{data = my_breaks}, so choose the names of your time scales carefully}

\item{drop}{logical; if \code{TRUE}, drops all resulting rows
after expansion that reside outside the time window
defined by the given breaks}

\item{merge}{logical; if \code{TRUE}, retains all variables
from the original data, i.e. the original variables are
repeated on all the rows of the same original subject}

\item{verbose}{logical; if \code{TRUE}, the function
emits progress messages along the way}
}
\value{
A \code{data.table} or \code{data.frame}
(depending on \code{options("popEpi.datatable")}; see \code{?popEpi})
object expanded to accommodate split observations.
}
\description{
Split a \code{Lexis} object along multiple time scales
with speed and ease
}
\details{
\code{splitMulti} is in essence a \pkg{data.table} version of
\code{splitLexis} or \code{survSplit} for splitting along multiple
time scales.
It requires a Lexis object as input.

The \code{breaks} must be a list of named vectors of the appropriate type.
The breaks are fully explicit,
left-inclusive and right-exclusive: e.g. \code{fot=c(0,5)}
restricts each original row to the follow-up time within
\verb{[0,5)} (unless \code{drop = FALSE}).
Use \code{Inf} or \code{-Inf} for open-ended intervals,
e.g. \code{per=c(1990,1995,Inf)} creates the intervals
\verb{[1990,1995), [1995, Inf)}.

Instead of specifying \code{breaks}, one may make use of the \code{...}
argument to pass breaks: e.g.

\code{splitMulti(x, breaks = list(fot = 0:5))}

is equivalent to

\code{splitMulti(x, fot = 0:5)}.

Multiple breaks can be supplied in the same manner. However, if both
\code{breaks} and \code{...} are used, only the breaks in \code{breaks}
are utilized within the function.

The \code{Lexis} time scale variables can be in any
format, e.g. \code{Date},
fractional years (see \verb{[Epi::cal.yr]} and \verb{[get.yrs]}),
or other.
}
\examples{
#### let's prepare data for computing period method survivals
#### in case there are problems with dates, we first
#### convert to fractional years.
\donttest{
library("Epi")
library("data.table")
data("sire", package = "popEpi")
x <- Lexis(data=sire[dg_date < ex_date, ],
           entry = list(fot=0, per=get.yrs(dg_date), age=dg_age),
           exit=list(per=get.yrs(ex_date)), exit.status=status)
x2 <- splitMulti(x, breaks = list(fot=seq(0, 5, by = 3/12), per=c(2008, 2013)))
# equivalently:
x2 <- splitMulti(x, fot=seq(0, 5, by = 3/12), per=c(2008, 2013))
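
## a minimal sketch of the precedence rule described in Details, using a
## hypothetical one-row data set (toy_p, lx_p and the column names fot_in,
## fot_out and died are illustration-only assumptions): when both breaks and
## ... are supplied, only breaks should be used.
library("Epi")
library("popEpi")
toy_p <- data.frame(fot_in = 0, fot_out = 7.5, died = 1L)
lx_p <- Lexis(data = toy_p,
              entry = list(fot = fot_in),
              exit = list(fot = fot_out), exit.status = died)
## fot = 0:2 is passed via ..., but breaks is not NULL, so per the
## documentation the split should follow fot = c(0, 5) only
both <- splitMulti(lx_p, breaks = list(fot = c(0, 5)), fot = 0:2)
only_breaks <- splitMulti(lx_p, breaks = list(fot = c(0, 5)))
all.equal(both$lex.dur, only_breaks$lex.dur)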

## using dates; note: breaks must be expressed as dates or days!
x <- Lexis(data=sire[dg_date < ex_date, ],
           entry = list(fot=0, per=dg_date, age=dg_date-bi_date),
           exit=list(per=ex_date), exit.status=status)
BL <- list(fot = seq(0, 5, by = 3/12)*365.242199,
           per = as.Date(paste0(c(1980:2014),"-01-01")),
           age = c(0,45,85,Inf)*365.242199)
x2 <- splitMulti(x, breaks = BL, verbose=TRUE)
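
## a minimal sketch of the interval semantics described in Details, using a
## hypothetical subject followed from fot = 0 to fot = 7.5 (toy_d, lx_d and
## the column names fot_in, fot_out and died are illustration-only assumptions)
library("Epi")
library("popEpi")
toy_d <- data.frame(fot_in = 0, fot_out = 7.5, died = 1L)
lx_d <- Lexis(data = toy_d,
              entry = list(fot = fot_in),
              exit = list(fot = fot_out), exit.status = died)
## drop = TRUE (the default): only follow-up time within [0, 5) should remain
in_window <- splitMulti(lx_d, fot = c(0, 5))
sum(in_window$lex.dur)
## drop = FALSE: follow-up time outside the breaks should be retained as well
all_time <- splitMulti(lx_d, fot = c(0, 5), drop = FALSE)
sum(all_time$lex.dur)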


## multistate example (healthy - sick - dead)
sire2 <- data.frame(sire)
sire2 <- sire2[sire2$dg_date < sire2$ex_date, ]

set.seed(1L)
not_sick <- sample.int(nrow(sire2), 6000L, replace = FALSE)
sire2$dg_date[not_sick] <- NA
sire2$status[!is.na(sire2$dg_date) & sire2$status == 0] <- -1

sire2$status[sire2$status==2] <- 1
sire2$status <- factor(sire2$status, levels = c(0, -1, 1),
                       labels = c("healthy", "sick", "dead"))

xm <- Lexis(data = sire2,
            entry = list(fot=0, per=get.yrs(bi_date), age=0),
            exit = list(per=get.yrs(ex_date)), exit.status=status)
xm2 <- cutLexis(xm, cut = get.yrs(xm$dg_date),
                timescale = "per",
                new.state = "sick")
xm2[xm2$lex.id == 6L, ]

xm2 <- splitMulti(xm2, breaks = list(fot = seq(0,150,25)))
xm2[xm2$lex.id == 6L, ]
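
## a minimal sketch of the partial-matching pitfall noted under the ...
## argument; demo_split is a hypothetical stand-in that only mimics the
## leading arguments of splitMulti and is not part of popEpi
demo_split <- function(data, breaks = NULL, ...) {
  list(data = data, breaks = breaks, dots = list(...))
}
## "dat" partially matches "data", so 0:5 is taken as data and the
## unnamed first argument is pushed into breaks; nothing reaches ...
res <- demo_split("my_lexis", dat = 0:5)
identical(res$data, 0:5)
identical(res$breaks, "my_lexis")
length(res$dots)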
}

}
\seealso{
\verb{[Epi::splitLexis]}, \verb{[Epi::Lexis]},
\verb{[survival::survSplit]}

Other splitting functions: 
\code{\link{lexpand}()},
\code{\link{splitLexisDT}()}
}
\author{
Joonas Miettinen
}
\concept{splitting functions}