% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/removeConstantFeatures.R
\name{removeConstantFeatures}
\alias{removeConstantFeatures}
\title{Remove constant features from a data set.}
\usage{
removeConstantFeatures(
  obj,
  perc = 0,
  dont.rm = character(0L),
  na.ignore = FALSE,
  wrap.tol = .Machine$double.eps^0.5,
  show.info = getMlrOption("show.info"),
  ...
)
}
\arguments{
\item{obj}{(\link{data.frame} | \link{Task})\cr
Input data.}

\item{perc}{(\code{numeric(1)})\cr
The proportion of a feature's values, in [0, 1), that must differ from the mode value.
Default is 0, which means only constant features with exactly one observed level are removed.}

\item{dont.rm}{(\link{character})\cr
Names of the columns which must not be deleted.
Default is no columns.}

\item{na.ignore}{(\code{logical(1)})\cr
Should NAs be ignored in the percentage calculation?
(Or should they be treated as a single, extra level in the percentage calculation?)
Note that if the feature has only missing values, it is always removed.
Default is \code{FALSE}.}

\item{wrap.tol}{(\code{numeric(1)})\cr
Numerical tolerance to treat two numbers as equal.
Variables stored as \code{double} will get rounded accordingly before computing the mode.
Default is \code{sqrt(.Machine$double.eps)}.}

\item{show.info}{(\code{logical(1)})\cr
Print verbose output on console?
Default is set via \link{configureMlr}.}

\item{...}{To ensure backward compatibility with the old argument \code{tol}.}
}
\value{
\link{data.frame} | \link{Task}. Same type as \code{obj}.
}
\description{
Constant features can cause errors in some models and provide
no information that can be learned from the training set.
With the argument \code{perc}, it is also possible to remove
features for which less than a proportion \code{perc} of the observations
differ from the mode value.
}
\seealso{
Other eda_and_preprocess: 
\code{\link{capLargeValues}()},
\code{\link{createDummyFeatures}()},
\code{\link{dropFeatures}()},
\code{\link{mergeSmallFactorLevels}()},
\code{\link{normalizeFeatures}()},
\code{\link{summarizeColumns}()},
\code{\link{summarizeLevels}()}
}
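\examples{
# A minimal sketch of typical usage, not taken from the package sources.
# The data frame 'df' and its column names are hypothetical and chosen only
# for illustration. The code assumes the mlr package is attached, as it is
# when package examples are run.
df = data.frame(
  a = 1:5,                        # every value differs, not constant
  b = rep(1, 5),                  # exactly one observed value
  c = c("x", "x", "x", "x", "y")  # one of five values differs from the mode
)

# With the default perc = 0, only the constant column 'b' should be dropped.
removeConstantFeatures(df, show.info = FALSE)

# A larger perc also targets nearly constant columns such as 'c',
# where only a proportion of 0.2 of the values differs from the mode.
removeConstantFeatures(df, perc = 0.25, show.info = FALSE)

# Columns listed in dont.rm are protected from removal.
removeConstantFeatures(df, dont.rm = "b", show.info = FALSE)
}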
\concept{eda_and_preprocess}