1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
|
\name{importance}
\alias{importance}
\alias{importance.default}
\alias{importance.randomForest}
\title{Extract variable importance measure}
\description{
This is the extractor function for variable importance measures as
produced by \code{\link{randomForest}}.
}
\usage{
\method{importance}{randomForest}(x, type=NULL, class=NULL, scale=TRUE, ...)
}
\arguments{
\item{x}{an object of class \code{\link{randomForest}}}.
\item{type}{either 1 or 2, specifying the type of importance measure
(1=mean decrease in accuracy, 2=mean decrease in node impurity).}
\item{class}{for classification problem, which class-specific measure
to return.}
\item{scale}{For permutation based measures, should the measures be
divided their ``standard errors''?}
\item{...}{not used.}
}
\value{
A (named) vector of importance measure, one for each predictor variable.
}
\details{
Here are the definitions of the variable importance measures. For
each tree, the prediction accuracy on the out-of-bag portion of the
data is recorded. Then the same is done after permuting each
predictor variable. The difference between the two accuracies are
then averaged over all trees, and normalized by the standard
error. For regression, the MSE is computed on the out-of-bag data for
each tree, and then the same computed after permuting a variable. The
differences are averaged and normalized by the standard error. If the
standard error is equal to 0 for a variable, the division is not done
(but the measure is almost always equal to 0 in that case).
The second measure is the total decrease in node impurities from
splitting on the variable, averaged over all trees. For
classification, the node impurity is measured by the Gini index.
For regression, it is measured by residual sum of squares.
}
%\references{
%}
\seealso{
\code{\link{randomForest}}, \code{\link{varImpPlot}}
}
\examples{
set.seed(4543)
data(mtcars)
mtcars.rf <- randomForest(mpg ~ ., data=mtcars, ntree=1000,
keep.forest=FALSE, importance=TRUE)
importance(mtcars.rf)
importance(mtcars.rf, type=1)
}
%\author{}
\keyword{regression}
\keyword{classif}
\keyword{tree}
|