% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/across.R
\name{across}
\alias{across}
\alias{if_any}
\alias{if_all}
\title{Apply a function (or functions) across multiple columns}
\usage{
across(.cols = everything(), .fns = NULL, ..., .names = NULL)
if_any(.cols = everything(), .fns = NULL, ..., .names = NULL)
if_all(.cols = everything(), .fns = NULL, ..., .names = NULL)
}
\arguments{
\item{.cols, cols}{<\code{\link[=dplyr_tidy_select]{tidy-select}}> Columns to transform.
Because \code{across()} is used within functions like \code{summarise()} and
\code{mutate()}, you can't select or compute upon grouping variables.}
\item{.fns}{Functions to apply to each of the selected columns.
Possible values are:
\itemize{
\item A function, e.g. \code{mean}.
\item A purrr-style lambda, e.g. \code{~ mean(.x, na.rm = TRUE)}
\item A list of functions/lambdas, e.g.
\verb{list(mean = mean, n_miss = ~ sum(is.na(.x)))}
\item \code{NULL}: the default value, returns the selected columns in a data
frame without applying a transformation. This is useful for when you want to
use a function that takes a data frame.
}
Within these functions you can use \code{\link[=cur_column]{cur_column()}} and \code{\link[=cur_group]{cur_group()}}
to access the current column and grouping keys respectively.}
\item{...}{Additional arguments for the function calls in \code{.fns}. Using these
\code{...} is strongly discouraged because of issues of timing of evaluation.}
\item{.names}{A glue specification that describes how to name the output
columns. This can use \code{{.col}} to stand for the selected column name, and
\code{{.fn}} to stand for the name of the function being applied. The default
(\code{NULL}) is equivalent to \code{"{.col}"} for the single function case and
\code{"{.col}_{.fn}"} for the case where a list is used for \code{.fns}.}
}
\value{
\code{across()} returns a tibble with one column for each column in \code{.cols} and each function in \code{.fns}.
\code{if_any()} and \code{if_all()} return a logical vector.
}
\description{
\code{across()} makes it easy to apply the same transformation to multiple
columns, allowing you to use \code{\link[=select]{select()}} semantics inside "data-masking"
functions like \code{\link[=summarise]{summarise()}} and \code{\link[=mutate]{mutate()}}. See \code{vignette("colwise")} for
more details.

\code{if_any()} and \code{if_all()} apply the same
predicate function to a selection of columns and combine the
results into a single logical vector: \code{if_any()} is \code{TRUE} when
the predicate is \code{TRUE} for \emph{any} of the selected columns; \code{if_all()}
is \code{TRUE} when the predicate is \code{TRUE} for \emph{all} selected columns.

\code{across()} supersedes the family of "scoped variants" like
\code{summarise_at()}, \code{summarise_if()}, and \code{summarise_all()}.
}
\section{Timing of evaluation}{
R code in dplyr verbs is generally evaluated once per group.
Inside \code{across()} however, code is evaluated once for each
combination of columns and groups. If the evaluation timing is
important, for example if you're generating random variables, think
about when it should happen and place your code accordingly.
\if{html}{\out{<div class="sourceCode r">}}\preformatted{gdf <-
  tibble(g = c(1, 1, 2, 3), v1 = 10:13, v2 = 20:23) \%>\%
  group_by(g)

set.seed(1)

# Outside: 1 normal variate
n <- rnorm(1)
gdf \%>\% mutate(across(v1:v2, ~ .x + n))
#> # A tibble: 4 x 3
#> # Groups:   g [3]
#>       g    v1    v2
#>   <dbl> <dbl> <dbl>
#> 1     1  9.37  19.4
#> 2     1 10.4   20.4
#> 3     2 11.4   21.4
#> 4     3 12.4   22.4

# Inside a verb: 3 normal variates (ngroup)
gdf \%>\% mutate(n = rnorm(1), across(v1:v2, ~ .x + n))
#> # A tibble: 4 x 4
#> # Groups:   g [3]
#>       g    v1    v2      n
#>   <dbl> <dbl> <dbl>  <dbl>
#> 1     1  10.2  20.2  0.184
#> 2     1  11.2  21.2  0.184
#> 3     2  11.2  21.2 -0.836
#> 4     3  14.6  24.6  1.60

# Inside `across()`: 6 normal variates (ncol * ngroup)
gdf \%>\% mutate(across(v1:v2, ~ .x + rnorm(1)))
#> # A tibble: 4 x 3
#> # Groups:   g [3]
#>       g    v1    v2
#>   <dbl> <dbl> <dbl>
#> 1     1  10.3  20.7
#> 2     1  11.3  21.7
#> 3     2  11.2  22.6
#> 4     3  13.5  22.7
}\if{html}{\out{</div>}}
}
\examples{
# across() -----------------------------------------------------------------
# Different ways to select the same set of columns
# See <https://tidyselect.r-lib.org/articles/syntax.html> for details
iris \%>\%
  as_tibble() \%>\%
  mutate(across(c(Sepal.Length, Sepal.Width), round))
iris \%>\%
  as_tibble() \%>\%
  mutate(across(c(1, 2), round))
iris \%>\%
  as_tibble() \%>\%
  mutate(across(1:Sepal.Width, round))
iris \%>\%
  as_tibble() \%>\%
  mutate(across(where(is.double) & !c(Petal.Length, Petal.Width), round))
# A purrr-style formula
iris \%>\%
  group_by(Species) \%>\%
  summarise(across(starts_with("Sepal"), ~ mean(.x, na.rm = TRUE)))
# A named list of functions
iris \%>\%
  group_by(Species) \%>\%
  summarise(across(starts_with("Sepal"), list(mean = mean, sd = sd)))
# Use the .names argument to control the output names
iris \%>\%
  group_by(Species) \%>\%
  summarise(across(starts_with("Sepal"), mean, .names = "mean_{.col}"))
iris \%>\%
  group_by(Species) \%>\%
  summarise(across(starts_with("Sepal"), list(mean = mean, sd = sd), .names = "{.col}.{.fn}"))
# When the list is not named, .fn is replaced by the function's position
iris \%>\%
  group_by(Species) \%>\%
  summarise(across(starts_with("Sepal"), list(mean, sd), .names = "{.col}.fn{.fn}"))
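# A minimal sketch, not part of the original examples: inside .fns,
# cur_column() returns the name of the column currently being processed.
# Here it looks up a per-column multiplier in `mult`; both `df_mult` and
# `mult` are hypothetical names introduced only for illustration.
df_mult <- tibble(x = 1:3, y = 3:5, z = 5:7)
mult <- list(x = 1, y = 10, z = 100)
df_mult \%>\%
  mutate(across(all_of(names(mult)), ~ .x * mult[[cur_column()]]))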
# across() returns a data frame, which can be used as input of another function
df <- data.frame(
  x1 = c(1, 2, NA),
  x2 = c(4, NA, 6),
  y = c("a", "b", "c")
)
df \%>\%
  mutate(x_complete = complete.cases(across(starts_with("x"))))
df \%>\%
  filter(complete.cases(across(starts_with("x"))))
# if_any() and if_all() ----------------------------------------------------
iris \%>\%
  filter(if_any(ends_with("Width"), ~ . > 4))
iris \%>\%
  filter(if_all(ends_with("Width"), ~ . > 2))
}
\seealso{
\code{\link[=c_across]{c_across()}} for a function that returns a vector
}