\name{sshzd}
\alias{sshzd}
\alias{sshzd1}
\title{Estimating Hazard Function Using Smoothing Splines}
\description{
    Estimate hazard function using smoothing spline ANOVA models.  The
    symbolic model specification via \code{formula} follows the same
    rules as in \code{\link{lm}}, but with the response of a special
    form.
}
\usage{
sshzd(formula, type=NULL, data=list(), alpha=1.4, weights=NULL,
      subset, offset, na.action=na.omit, partial=NULL, id.basis=NULL,
      nbasis=NULL, seed=NULL, random=NULL, prec=1e-7, maxiter=30,
      skip.iter=FALSE)

sshzd1(formula, type=NULL, data=list(), alpha=1.4, weights=NULL,
       subset, na.action=na.omit, rho="marginal", partial=NULL,
       id.basis=NULL, nbasis=NULL, seed=NULL, random=NULL, prec=1e-7,
       maxiter=30, skip.iter=FALSE)
}
\arguments{
    \item{formula}{Symbolic description of the model to be fit, where
        the response is of the form \code{Surv(futime,status,start=0)}.}
    \item{type}{List specifying the type of spline for each variable.
        See \code{\link{mkterm}} for details.}
    \item{data}{Optional data frame containing the variables in the
        model.}
    \item{alpha}{Parameter defining cross-validation score for smoothing
        parameter selection.}
    \item{weights}{Optional vector of counts for duplicated data.}
    \item{subset}{Optional vector specifying a subset of observations
	to be used in the fitting process.}
    \item{offset}{Optional offset term with known parameter 1.}
    \item{na.action}{Function which indicates what should happen when
        the data contain NAs.}
    \item{partial}{Optional symbolic description of parametric terms in
        partial spline models.}
    \item{id.basis}{Index of observations to be used as "knots."}
    \item{nbasis}{Number of "knots" to be used.  Ignored when
        \code{id.basis} is specified.}
    \item{seed}{Seed to be used for the random generation of "knots."
        Ignored when \code{id.basis} is specified.}
    \item{random}{Input for parametric random effects (frailty) in
        nonparametric mixed-effect models.  See \code{\link{mkran}} for
	details.}
    \item{prec}{Precision requirement for internal iterations.}
    \item{maxiter}{Maximum number of iterations allowed for
	internal iterations.}
    \item{skip.iter}{Flag indicating whether to use initial values of
        theta and skip theta iteration.  See \code{\link{ssanova}} for
	notes on skipping theta iteration.}
    \item{rho}{Choice of rho function for sshzd1: \code{"marginal"} or
        \code{"weibull"}.}
}
\details{
    The model specification via \code{formula} is for the log hazard.
    For example, \code{Surv(t,d)~t*u} prescribes a model of the form
    \deqn{
	log f(t,u) = C + g_{t}(t) + g_{u}(u) + g_{t,u}(t,u)
    }
    with the terms denoted by \code{"1"}, \code{"t"}, \code{"u"}, and
    \code{"t:u"}.  Replacing \code{t*u} by \code{t+u} in the
    \code{formula}, one gets a proportional hazard model with
    \eqn{g_{t,u}=0}.

    \code{sshzd} takes standard right-censored lifetime data, with
    possible left-truncation and covariates; in
    \code{Surv(futime,status,start=0)~...}, \code{futime} is the  
    follow-up time, \code{status} is the censoring indicator, and
    \code{start} is the optional left-truncation time.  The main effect
    of \code{futime} must appear in the model terms specified via
    \code{...}.

    Parallel to those in a \code{\link{ssanova}} object, the model terms
    are sums of unpenalized and penalized terms.  Attached to every
    penalized term there is a smoothing parameter, and the model
    complexity is largely determined by the number of smoothing
    parameters.

    The selection of smoothing parameters is through a cross-validation
    mechanism described in Gu (2002, Sec. 7.2), with a parameter
    \code{alpha}; \code{alpha=1} is "unbiased" for the minimization of
    Kullback-Leibler loss but may yield severe undersmoothing, whereas
    larger \code{alpha} yields smoother estimates.

    A subset of the observations is selected as "knots."  Unless
    specified via \code{id.basis} or \code{nbasis}, the number of
    "knots" \eqn{q} is determined by \eqn{max(30,10n^{2/9})}, which is
    appropriate for the default cubic splines for numerical vectors.
}
\note{
    The function \code{Surv(futime,status,start=0)} is defined and
    parsed inside \code{sshzd}, not quite the same as the one in the
    \code{survival} package.

    Integration on the time axis is done by the 200-point Gauss-Legendre
    formula on \code{c(min(start),max(futime))}, returned from
    \code{\link{gauss.quad}}.

    \code{sshzd1} can be up to 50 times faster than \code{sshzd}, at the
    cost of some degradation in estimation performance.

    The results may vary from run to run.  For consistency, specify
    \code{id.basis} or set \code{seed}.
}
\value{
    \code{sshzd} returns a list object of class \code{"sshzd"}.
    \code{sshzd1} returns a list object of class
    \code{c("sshzd1","sshzd")}.

    \code{\link{hzdrate.sshzd}} can be used to evaluate the estimated
    hazard function.  \code{\link{hzdcurve.sshzd}} can be used to
    evaluate hazard curves with fixed covariates.
    \code{\link{survexp.sshzd}} can be used to calculate the estimated
    expected survival.

    The method \code{\link{project.sshzd}} can be used to calculate the
    Kullback-Leibler projection of \code{"sshzd"} objects for model
    selection; \code{\link{project.sshzd1}} can be used to calculate the
    square error projection of \code{"sshzd1"} objects.
}
\author{Chong Gu, \email{chong@stat.purdue.edu}}
\references{
    Du, P. and Gu, C. (2006), Penalized likelihood hazard estimation:
    efficient approximation and Bayesian confidence intervals.
    \emph{Statistics and Probability Letters}, \bold{76}, 244--254.

    Du, P. and Gu, C. (2009), Penalized Pseudo-Likelihood Hazard
    Estimation: A Fast Alternative to Penalized Likelihood.
    \emph{Journal of Statistical Planning and Inference}, \bold{139},
    891--899.

    Du, P. and Ma, S. (2010), Frailty Model with Spline Estimated
    Nonparametric Hazard Function.  \emph{Statistica Sinica}, \bold{20},
    561--580.
    
    Gu, C. (2013), \emph{Smoothing Spline ANOVA Models (2nd Ed)}.  New
    York: Springer-Verlag.

    Gu, C. (2014), Smoothing Spline ANOVA Models: R Package gss.
    \emph{Journal of Statistical Software}, \bold{58}(5), 1--25. URL
    \url{http://www.jstatsoft.org/v58/i05/}.
}
\examples{
## Model with interaction
data(gastric)
gastric.fit <- sshzd(Surv(futime,status)~futime*trt,data=gastric)
## exp(-Lambda(600)), exp(-(Lambda(1200)-Lambda(600))), and exp(-Lambda(1200))
survexp.sshzd(gastric.fit,c(600,1200,1200),data.frame(trt=as.factor(1)),c(0,600,0))
## Clean up
\dontrun{rm(gastric,gastric.fit)
dev.off()}
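
## Default number of "knots": a minimal sketch of the rule
## q = max(30, 10 n^(2/9)) stated in Details.  The helper name
## default.nbasis and the rounding via ceiling() are assumptions made
## for illustration; they are not part of the documented gss interface.
default.nbasis <- function(n) max(30, ceiling(10*n^(2/9)))
## The floor of 30 applies for small samples; q grows slowly with n
sapply(c(100,500,5000), default.nbasis)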

## THE FOLLOWING EXAMPLE IS TIME-CONSUMING
## Proportional hazard model
\dontrun{
data(stan)
stan.fit <- sshzd(Surv(futime,status)~futime+age,data=stan)
## Evaluate fitted hazard
hzdrate.sshzd(stan.fit,data.frame(futime=c(10,20),age=c(20,30)))
## Plot lambda(t,age=20)
tt <- seq(0,60,length.out=101)
hh <- hzdcurve.sshzd(stan.fit,tt,data.frame(age=20))
plot(tt,hh,type="l")
## Clean up
rm(stan,stan.fit,tt,hh)
dev.off()
}
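
## Reproducibility: a hedged sketch of the advice in Note.  Fixing seed
## should pin down the random selection of "knots", so that two fits of
## the same model agree.  The seed value 5732 and the object names
## fit.a, fit.b, and nd are arbitrary choices made for illustration.
library(gss)
data(gastric)
fit.a <- sshzd(Surv(futime,status)~futime*trt,data=gastric,seed=5732)
fit.b <- sshzd(Surv(futime,status)~futime*trt,data=gastric,seed=5732)
## Evaluate both fitted hazards at the same points and compare
nd <- data.frame(futime=c(300,600,900),trt=as.factor(1))
all.equal(hzdrate.sshzd(fit.a,nd),hzdrate.sshzd(fit.b,nd))
## Clean up
rm(fit.a,fit.b,nd)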
}
\keyword{smooth}
\keyword{models}
\keyword{survival}