\name{lmrob}
\alias{lmrob}
\title{MM-type Estimators for Linear Regression}
\description{
  Computes fast MM-type estimators for linear (regression) models.
}
\usage{
lmrob(formula, data, subset, weights, na.action, method = "MM",
      model = TRUE, x = !control$compute.rd, y = FALSE,
      singular.ok = TRUE, contrasts = NULL, offset = NULL,
      control = NULL, ...)
}
\arguments{
  \item{formula}{a symbolic description of the model to be fit.  See
    \code{\link{lm}} and \code{\link{formula}} for more details.}

  \item{data}{an optional data frame, list or environment (or object
    coercible by \code{\link{as.data.frame}} to a data frame) containing
    the variables in the model.  If not found in \code{data}, the
    variables are taken from \code{environment(formula)},
    typically the environment from which \code{lmrob} is called.}

  \item{subset}{an optional vector specifying a subset of observations
    to be used in the fitting process.}

  \item{weights}{an optional vector of weights to be used
    in the fitting process. %%% If specified, weighted least squares is used
    %%% with weights \code{weights} (that is, minimizing \code{sum(w*e^2)});
    %%% otherwise ordinary least squares is used.
    }
  \item{na.action}{a function which indicates what should happen
    when the data contain \code{NA}s.  The default is set by
    the \code{na.action} setting of \code{\link{options}}, and is
    \code{\link{na.fail}} if that is unset.  The \dQuote{factory-fresh}
    default is \code{\link{na.omit}}.  Another possible value is
    \code{NULL}, no action.  Value \code{\link{na.exclude}} can be useful.}

  \item{method}{string specifying the estimator chain.  \code{MM}
    is interpreted as \code{SM}.  See \emph{Details}.}

  \item{model, x, y}{logicals.  If \code{TRUE} the corresponding
    components of the fit (the model frame, the model matrix, the
    response) are returned.}

  \item{singular.ok}{logical. If \code{FALSE} (the default in S but
    not in \R) a singular fit is an error.}

  \item{contrasts}{an optional list.  See the \code{contrasts.arg}
    of \code{\link{model.matrix.default}}.}

  \item{offset}{this can be used to specify an \emph{a priori}
    known component to be included in the linear predictor
    during fitting.  An \code{\link{offset}} term can be included in the
    formula instead or as well, and if both are specified their sum is used.}

  \item{control}{a \code{\link{list}} specifying control parameters; use
    the function \code{\link{lmrob.control}(.)} and see its help page.}
  \item{\dots}{can be used to specify control parameters directly
    instead of via \code{control}.}
}
\details{
  This function computes an MM-type regression estimator
  as described in Yohai (1987) and Koller and Stahel (2011).  By default
  it uses a bi-square redescending score function, and it returns a
  highly robust and highly efficient estimator (with 50\% breakdown
  point and 95\% asymptotic efficiency for normal errors). The
  computation is carried out by a call to \code{\link{lmrob.fit}()}.

  The argument \code{setting} of \code{\link{lmrob.control}} is provided
  to set alternative defaults as suggested in Koller and Stahel (2011)
  (use \code{setting='KS2011'}). For details, see
  \code{\link{lmrob.control}}.

  As initial estimator it uses an S-estimator (Rousseeuw and Yohai,
  1984) which is computed using the Fast-S algorithm of Salibian-Barrera
  and Yohai (2006), calling the function \code{\link{lmrob.S}}. The
  following chain of estimates is customizable via the \code{method}
  argument of \code{\link{lmrob.control}}. There are currently two types
  of estimates available: \code{M} and \code{D}. The first corresponds
  to the standard M-regression estimate. \code{D} stands for the Design
  Adaptive Scale estimate as proposed in Koller and Stahel (2011). The
  \code{method} argument takes a string that specifies the estimates to
  be calculated as a chain. Setting \code{method='SMDM'} will result in
  an initial S-estimate, followed by an M-estimate, a Design Adaptive
  Scale estimate and a final M-step. For methods involving a
  \code{D}-step, the default value of \code{psi} is changed to \code{lqq}.

  By default, standard errors are computed using the formulas of Croux,
  Dhaene and Hoorelbeke (2003) (\code{\link{lmrob.control}} option
  \code{cov=".vcov.avar1"}). This method, however, works only for
  MM-estimates. For other \code{method} arguments, the covariance matrix
  estimate used is based on the asymptotic normality of the estimated
  coefficients (\code{cov=".vcov.w"}) as described in Koller and Stahel
  (2011).
}
\value{
  An object of class \code{lmrob}: a list that includes the
  following components:
  \item{coefficients}{The estimate of the coefficient vector}
  \item{init.S}{The list returned by \code{\link{lmrob.S}} (for
      MM-estimates only)}
  \item{init}{A similar list that contains the results of intermediate
  estimates (not for MM-estimates).}
  \item{scale}{The scale as used in the M estimator.}
  \item{cov}{The estimated covariance matrix of the regression coefficients}
  \item{residuals}{Residuals associated with the estimator}
  \item{fitted.values}{Fitted values associated with the estimator}
  \item{weights}{the \dQuote{robustness weights} \eqn{\psi(r_i/S) / (r_i/S)}.}
  \item{converged}{\code{TRUE} if the IRWLS iterations have converged}
}
\references{
  Croux, C., Dhaene, G. and Hoorelbeke, D. (2003)
  \emph{Robust standard errors for robust estimators},
  Discussion Papers Series 03.16, K.U. Leuven, CES.

  Koller, M. and Stahel, W.A. (2011), Sharpening Wald-type inference in
  robust regression for small samples, \emph{Computational Statistics &
  Data Analysis} \bold{55}(8), 2504--2515.

  Rousseeuw, P.J. and Yohai, V.J. (1984)
  Robust regression by means of S-estimators,
  In \emph{Robust and Nonlinear Time Series},
  J. Franke, W. H\"ardle and R. D. Martin (eds.).
  Lecture Notes in Statistics 26, 256--272,
  Springer Verlag, New York.

  Salibian-Barrera, M. and Yohai, V.J. (2006)
  A fast algorithm for S-regression estimates,
  \emph{Journal of Computational and Graphical Statistics},
  \bold{15}(2), 414--427.

  Yohai, V.J. (1987)
  High breakdown-point and high efficiency robust estimates for regression.
  \emph{The Annals of Statistics} \bold{15}, 642--656.
}
\author{ Matias Salibian-Barrera and Manuel Koller}
\seealso{
  \code{\link{lmrob.control}};
  for the algorithms \code{\link{lmrob.S}} and \code{\link{lmrob.fit}};
  and for methods,
  \code{\link{predict.lmrob}}, \code{\link{summary.lmrob}},
  \code{\link{print.lmrob}}, and \code{\link{plot.lmrob}}.
  See \code{\link{lmrob..M..fit}} for examples of how to use a custom
  initial estimate.
}
\examples{
data(coleman)
summary( m1 <- lmrob(Y ~ ., data=coleman) )
summary( m2 <- lmrob(Y ~ ., data=coleman, setting = 'KS2011') )
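
## A hedged sketch (not part of the original examples): control
## parameters can also be bundled with lmrob.control() and passed via
## 'control' instead of being given directly through '...'.
## 'ctrl' and 'm2c' are illustrative names only.
ctrl <- lmrob.control(setting = "KS2011")
m2c <- lmrob(Y ~ ., data = coleman, control = ctrl)
coef(m2c)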

data(starsCYG, package = "robustbase")
## Plot simple data and fitted lines
plot(starsCYG)
  lmST <-    lm(log.light ~ log.Te, data = starsCYG)
(RlmST <- lmrob(log.light ~ log.Te, data = starsCYG))
abline(lmST, col = "red")
abline(RlmST, col = "blue")
summary(RlmST)
vcov(RlmST)
stopifnot(all.equal(fitted(RlmST),
                    predict(RlmST, newdata = starsCYG),
                    tol = 1e-14))
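
## A hedged sketch of the estimator chain described in Details:
## method = "SMDM" requests an initial S-estimate, an M-step, a Design
## Adaptive Scale (D) step and a final M-step.  'mMM' and 'mSMDM' are
## illustrative names only; the components used are those listed in Value.
data(coleman)
mMM   <- lmrob(Y ~ ., data = coleman)                   # default method = "MM"
mSMDM <- lmrob(Y ~ ., data = coleman, method = "SMDM")
cbind(MM = coef(mMM), SMDM = coef(mSMDM))               # coefficients side by side
c(MM = mMM$scale, SMDM = mSMDM$scale)                   # scale estimates
mSMDM$converged                                         # did the IRWLS iterations converge?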

}
\keyword{robust}
\keyword{regression}