File: roles.Rd

package info (click to toggle)
r-cran-recipes 0.1.15%2Bdfsg-1
  • links: PTS, VCS
  • area: main
  • in suites: bullseye
  • size: 2,496 kB
  • sloc: sh: 37; makefile: 2
file content (129 lines) | stat: -rw-r--r-- 4,707 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/roles.R
\name{roles}
\alias{roles}
\alias{add_role}
\alias{update_role}
\alias{remove_role}
\title{Manually Alter Roles}
\usage{
add_role(recipe, ..., new_role = "predictor", new_type = NULL)

update_role(recipe, ..., new_role = "predictor", old_role = NULL)

remove_role(recipe, ..., old_role)
}
\arguments{
\item{recipe}{An existing \code{\link[=recipe]{recipe()}}.}

\item{...}{One or more selector functions to choose which variables are
being assigned a role. See \code{\link[=selections]{selections()}} for more details.}

\item{new_role}{A character string for a single role.}

\item{new_type}{A character string for specific type that the variable should
be identified as. If left as \code{NULL}, the type is automatically identified
as the \emph{first} type you see for that variable in \code{summary(recipe)}.}

\item{old_role}{A character string for the specific role to update for the
variables selected by \code{...}. \code{update_role()} accepts a \code{NULL} as long as the
variables have only a single role.}
}
\value{
An updated recipe object.
}
\description{
\code{update_role()} alters an existing role in the recipe or assigns an initial
role to variables that do not yet have a declared role.

\code{add_role()} adds an \emph{additional} role to variables that already have a role
in the recipe. It does not overwrite old roles, as a single variable can have
multiple roles.

\code{remove_role()} eliminates a single existing role in the recipe.
}
\details{
\code{update_role()} should be used when a variable doesn't currently have a role
in the recipe, or to replace an \code{old_role} with a \code{new_role}. \code{add_role()}
only adds additional roles to variables that already have roles and will
throw an error when the current role is missing (i.e. \code{NA}).

When using \code{add_role()}, if a variable is selected that already has the
\code{new_role}, a warning is emitted and that variable is skipped so no duplicate
roles are added.

Adding or updating roles is a useful way to group certain variables that
don't fall in the standard \code{"predictor"} bucket. You can perform a step
on all of the variables that have a custom role with the selector
\code{\link[=has_role]{has_role()}}.
}
\examples{
library(recipes)
library(modeldata)
data(biomass)

# Using the formula method, roles are created for any outcomes and predictors:
recipe(HHV ~ ., data = biomass) \%>\%
  summary()

# However `sample` and `dataset` aren't predictors. Since they already have
# roles, `update_role()` can be used to make changes:
recipe(HHV ~ ., data = biomass) \%>\%
  update_role(sample, new_role = "id variable") \%>\%
  update_role(dataset, new_role = "splitting variable") \%>\%
  summary()

# `update_role()` cannot set a role to NA, use `remove_role()` for that
\dontrun{
recipe(HHV ~ ., data = biomass) \%>\%
  update_role(sample, new_role = NA_character_)
}

# ------------------------------------------------------------------------------

# Variables can have more than one role. `add_role()` can be used
# if the column already has at least one role:
recipe(HHV ~ ., data = biomass) \%>\%
  add_role(carbon, sulfur, new_role = "something") \%>\%
  summary()

# `update_role()` has an argument called `old_role` that is required to
# unambiguously update a role when the column currently has multiple roles.
recipe(HHV ~ ., data = biomass) \%>\%
  add_role(carbon, new_role = "something") \%>\%
  update_role(carbon, new_role = "something else", old_role = "something") \%>\%
  summary()

# `carbon` has two roles at the end, so the last `update_roles()` fails since
# `old_role` was not given.
\dontrun{
recipe(HHV ~ ., data = biomass) \%>\%
  add_role(carbon, sulfur, new_role = "something") \%>\%
  update_role(carbon, new_role = "something else")
}

# ------------------------------------------------------------------------------

# To remove a role, `remove_role()` can be used to remove a single role.
recipe(HHV ~ ., data = biomass) \%>\%
  add_role(carbon, new_role = "something") \%>\%
  remove_role(carbon, old_role = "something") \%>\%
  summary()

# To remove all roles, call `remove_role()` multiple times to reset to `NA`
recipe(HHV ~ ., data = biomass) \%>\%
  add_role(carbon, new_role = "something") \%>\%
  remove_role(carbon, old_role = "something") \%>\%
  remove_role(carbon, old_role = "predictor") \%>\%
  summary()

# ------------------------------------------------------------------------------

# If the formula method is not used, all columns have a missing role:
recipe(biomass) \%>\%
  summary()

}
\concept{model_specification}
\concept{preprocessing}
\keyword{datagen}