File: across.Rd

package info (click to toggle)
r-cran-dplyr 1.0.10-1
links: PTS, VCS
area: main
in suites: bookworm
size: 3,476 kB
sloc: cpp: 1,266; sh: 16; makefile: 9
file content (174 lines) | stat: -rw-r--r-- 6,152 bytes
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/across.R
\name{across}
\alias{across}
\alias{if_any}
\alias{if_all}
\title{Apply a function (or functions) across multiple columns}
\usage{
across(.cols = everything(), .fns = NULL, ..., .names = NULL)

if_any(.cols = everything(), .fns = NULL, ..., .names = NULL)

if_all(.cols = everything(), .fns = NULL, ..., .names = NULL)
}
\arguments{
\item{.cols, cols}{<\code{\link[=dplyr_tidy_select]{tidy-select}}> Columns to transform.
Because \code{across()} is used within functions like \code{summarise()} and
\code{mutate()}, you can't select or compute upon grouping variables.}

\item{.fns}{Functions to apply to each of the selected columns.
Possible values are:
\itemize{
\item A function, e.g. \code{mean}.
\item A purrr-style lambda, e.g. \code{~ mean(.x, na.rm = TRUE)}
\item A list of functions/lambdas, e.g.
\verb{list(mean = mean, n_miss = ~ sum(is.na(.x))}
\item \code{NULL}: the default value, returns the selected columns in a data
frame without applying a transformation. This is useful for when you want to
use a function that takes a data frame.
}

Within these functions you can use \code{\link[=cur_column]{cur_column()}} and \code{\link[=cur_group]{cur_group()}}
to access the current column and grouping keys respectively.}

\item{...}{Additional arguments for the function calls in \code{.fns}. Using these
\code{...} is strongly discouraged because of issues of timing of evaluation.}

\item{.names}{A glue specification that describes how to name the output
columns. This can use \code{{.col}} to stand for the selected column name, and
\code{{.fn}} to stand for the name of the function being applied. The default
(\code{NULL}) is equivalent to \code{"{.col}"} for the single function case and
\code{"{.col}_{.fn}"} for the case where a list is used for \code{.fns}.}
}
\value{
\code{across()} returns a tibble with one column for each column in \code{.cols} and each function in \code{.fns}.

\code{if_any()} and \code{if_all()} return a logical vector.
}
\description{
\code{across()} makes it easy to apply the same transformation to multiple
columns, allowing you to use \code{\link[=select]{select()}} semantics inside in "data-masking"
functions like \code{\link[=summarise]{summarise()}} and \code{\link[=mutate]{mutate()}}. See \code{vignette("colwise")} for
more details.

\code{if_any()} and \code{if_all()} apply the same
predicate function to a selection of columns and combine the
results into a single logical vector: \code{if_any()} is \code{TRUE} when
the predicate is \code{TRUE} for \emph{any} of the selected columns, \code{if_all()}
is \code{TRUE} when the predicate is \code{TRUE} for \emph{all} selected columns.

\code{across()} supersedes the family of "scoped variants" like
\code{summarise_at()}, \code{summarise_if()}, and \code{summarise_all()}.
}
\section{Timing of evaluation}{

R code in dplyr verbs is generally evaluated once per group.
Inside \code{across()} however, code is evaluated once for each
combination of columns and groups. If the evaluation timing is
important, for example if you're generating random variables, think
about when it should happen and place your code in consequence.

\if{html}{\out{<div class="sourceCode r">}}\preformatted{gdf <-
  tibble(g = c(1, 1, 2, 3), v1 = 10:13, v2 = 20:23) \%>\%
  group_by(g)

set.seed(1)

# Outside: 1 normal variate
n <- rnorm(1)
gdf \%>\% mutate(across(v1:v2, ~ .x + n))
#> # A tibble: 4 x 3
#> # Groups:   g [3]
#>       g    v1    v2
#>   <dbl> <dbl> <dbl>
#> 1     1  9.37  19.4
#> 2     1 10.4   20.4
#> 3     2 11.4   21.4
#> 4     3 12.4   22.4

# Inside a verb: 3 normal variates (ngroup)
gdf \%>\% mutate(n = rnorm(1), across(v1:v2, ~ .x + n))
#> # A tibble: 4 x 4
#> # Groups:   g [3]
#>       g    v1    v2      n
#>   <dbl> <dbl> <dbl>  <dbl>
#> 1     1  10.2  20.2  0.184
#> 2     1  11.2  21.2  0.184
#> 3     2  11.2  21.2 -0.836
#> 4     3  14.6  24.6  1.60

# Inside `across()`: 6 normal variates (ncol * ngroup)
gdf \%>\% mutate(across(v1:v2, ~ .x + rnorm(1)))
#> # A tibble: 4 x 3
#> # Groups:   g [3]
#>       g    v1    v2
#>   <dbl> <dbl> <dbl>
#> 1     1  10.3  20.7
#> 2     1  11.3  21.7
#> 3     2  11.2  22.6
#> 4     3  13.5  22.7
}\if{html}{\out{</div>}}
}

\examples{
# across() -----------------------------------------------------------------
# Different ways to select the same set of columns
# See <https://tidyselect.r-lib.org/articles/syntax.html> for details
iris \%>\%
  as_tibble() \%>\%
  mutate(across(c(Sepal.Length, Sepal.Width), round))
iris \%>\%
  as_tibble() \%>\%
  mutate(across(c(1, 2), round))
iris \%>\%
  as_tibble() \%>\%
  mutate(across(1:Sepal.Width, round))
iris \%>\%
  as_tibble() \%>\%
  mutate(across(where(is.double) & !c(Petal.Length, Petal.Width), round))

# A purrr-style formula
iris \%>\%
  group_by(Species) \%>\%
  summarise(across(starts_with("Sepal"), ~ mean(.x, na.rm = TRUE)))

# A named list of functions
iris \%>\%
  group_by(Species) \%>\%
  summarise(across(starts_with("Sepal"), list(mean = mean, sd = sd)))

# Use the .names argument to control the output names
iris \%>\%
  group_by(Species) \%>\%
  summarise(across(starts_with("Sepal"), mean, .names = "mean_{.col}"))
iris \%>\%
  group_by(Species) \%>\%
  summarise(across(starts_with("Sepal"), list(mean = mean, sd = sd), .names = "{.col}.{.fn}"))

# When the list is not named, .fn is replaced by the function's position
iris \%>\%
  group_by(Species) \%>\%
  summarise(across(starts_with("Sepal"), list(mean, sd), .names = "{.col}.fn{.fn}"))

# across() returns a data frame, which can be used as input of another function
df <- data.frame(
  x1  = c(1, 2, NA),
  x2  = c(4, NA, 6),
  y   = c("a", "b", "c")
)
df \%>\%
  mutate(x_complete = complete.cases(across(starts_with("x"))))
df \%>\%
  filter(complete.cases(across(starts_with("x"))))

# if_any() and if_all() ----------------------------------------------------
iris \%>\%
  filter(if_any(ends_with("Width"), ~ . > 4))
iris \%>\%
  filter(if_all(ends_with("Width"), ~ . > 2))

}
\seealso{
\code{\link[=c_across]{c_across()}} for a function that returns a vector
}