% File: extract.Rd

% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/extract.R
\name{extract}
\alias{extract}
\title{Extract a character column into multiple columns using regular
expression groups}
\usage{
extract(
  data,
  col,
  into,
  regex = "([[:alnum:]]+)",
  remove = TRUE,
  convert = FALSE,
  ...
)
}
\arguments{
\item{data}{A data frame.}

\item{col}{<\code{\link[=tidyr_tidy_select]{tidy-select}}> Column to expand.}

\item{into}{Names of the new variables to create, as a character vector.
Use \code{NA} to omit the variable in the output.}

\item{regex}{A string representing a regular expression used to extract the
desired values. There should be one group (defined by \verb{()}) for each
element of \code{into}.}

\item{remove}{If \code{TRUE}, remove input column from output data frame.}

\item{convert}{If \code{TRUE}, will run \code{\link[=type.convert]{type.convert()}} with
\code{as.is = TRUE} on new columns. This is useful if the component
columns are integer, numeric or logical.

NB: this will cause string \code{"NA"}s to be converted to \code{NA}s.}

\item{...}{Additional arguments passed on to methods.}
}
\description{
\ifelse{html}{\href{https://lifecycle.r-lib.org/articles/stages.html#superseded}{\figure{lifecycle-superseded.svg}{options: alt='[Superseded]'}}}{\strong{[Superseded]}}

\code{extract()} has been superseded in favour of \code{\link[=separate_wider_regex]{separate_wider_regex()}},
which has a more polished API and better handling of problems.
Superseded functions will not go away, but will only receive critical bug
fixes.

Given a regular expression with capturing groups, \code{extract()} turns
each group into a new column. If the regular expression does not match, or
the input is \code{NA}, the new columns will be \code{NA}.
}
\examples{
df <- tibble(x = c(NA, "a-b", "a-d", "b-c", "d-e"))
df \%>\% extract(x, "A")
df \%>\% extract(x, c("A", "B"), "([[:alnum:]]+)-([[:alnum:]]+)")
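
# A sketch (not from the original examples): an NA in `into` drops the
# matching group, so only the first group is kept here. `df_omit` and
# `res_omit` are illustrative names.
df_omit <- tibble(x = c(NA, "a-b", "a-d"))
res_omit <- extract(df_omit, x, c("A", NA), "([[:alnum:]]+)-([[:alnum:]]+)")
res_omit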

# The now-recommended replacement:
df \%>\%
  separate_wider_regex(
    x,
    patterns = c(A = "[[:alnum:]]+", "-", B = "[[:alnum:]]+")
  )

# If no match, NA:
df \%>\% extract(x, c("A", "B"), "([a-d]+)-([a-d]+)")
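
# A sketch (not from the original examples): convert = TRUE runs
# type.convert() on the new columns, and remove = FALSE keeps the input
# column. `df_num` and `res_num` are illustrative names.
df_num <- tibble(x = c("a-1", "b-2", NA))
res_num <- extract(
  df_num, x, c("key", "value"), "([a-z]+)-([0-9]+)",
  remove = FALSE, convert = TRUE
)
res_num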
}
\seealso{
\code{\link[=separate]{separate()}} to split up by a separator.
}