File: pivot_longer.Rd

package info (click to toggle)
r-cran-tidyr 1.3.1-1
  • links: PTS, VCS
  • area: main
  • in suites: sid, trixie
  • size: 2,720 kB
  • sloc: cpp: 268; sh: 9; makefile: 2
file content (169 lines) | stat: -rw-r--r-- 6,767 bytes parent folder | download | duplicates (2)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/pivot-long.R
\name{pivot_longer}
\alias{pivot_longer}
\title{Pivot data from wide to long}
\usage{
pivot_longer(
  data,
  cols,
  ...,
  cols_vary = "fastest",
  names_to = "name",
  names_prefix = NULL,
  names_sep = NULL,
  names_pattern = NULL,
  names_ptypes = NULL,
  names_transform = NULL,
  names_repair = "check_unique",
  values_to = "value",
  values_drop_na = FALSE,
  values_ptypes = NULL,
  values_transform = NULL
)
}
\arguments{
\item{data}{A data frame to pivot.}

\item{cols}{<\code{\link[=tidyr_tidy_select]{tidy-select}}> Columns to pivot into
longer format.}

\item{...}{Additional arguments passed on to methods.}

\item{cols_vary}{When pivoting \code{cols} into longer format, how should the
output rows be arranged relative to their original row number?
\itemize{
\item \code{"fastest"}, the default, keeps individual rows from \code{cols} close
together in the output. This often produces intuitively ordered output
when you have at least one key column from \code{data} that is not involved in
the pivoting process.
\item \code{"slowest"} keeps individual columns from \code{cols} close together in the
output. This often produces intuitively ordered output when you utilize
all of the columns from \code{data} in the pivoting process.
}}

\item{names_to}{A character vector specifying the new column or columns to
create from the information stored in the column names of \code{data} specified
by \code{cols}.
\itemize{
\item If length 0, or if \code{NULL} is supplied, no columns will be created.
\item If length 1, a single column will be created which will contain the
column names specified by \code{cols}.
\item If length >1, multiple columns will be created. In this case, one of
\code{names_sep} or \code{names_pattern} must be supplied to specify how the
column names should be split. There are also two additional character
values you can take advantage of:
\itemize{
\item \code{NA} will discard the corresponding component of the column name.
\item \code{".value"} indicates that the corresponding component of the column
name defines the name of the output column containing the cell values,
overriding \code{values_to} entirely.
}
}}

\item{names_prefix}{A regular expression used to remove matching text
from the start of each variable name.}

\item{names_sep, names_pattern}{If \code{names_to} contains multiple values,
these arguments control how the column name is broken up.

\code{names_sep} takes the same specification as \code{\link[=separate]{separate()}}, and can either
be a numeric vector (specifying positions to break on), or a single string
(specifying a regular expression to split on).

\code{names_pattern} takes the same specification as \code{\link[=extract]{extract()}}, a regular
expression containing matching groups (\verb{()}).

If these arguments do not give you enough control, use
\code{pivot_longer_spec()} to create a spec object and process manually as
needed.}

\item{names_ptypes, values_ptypes}{Optionally, a list of column name-prototype
pairs. Alternatively, a single empty prototype can be supplied, which will
be applied to all columns. A prototype (or ptype for short) is a
zero-length vector (like \code{integer()} or \code{numeric()}) that defines the type,
class, and attributes of a vector. Use these arguments if you want to
confirm that the created columns are the types that you expect. Note that
if you want to change (instead of confirm) the types of specific columns,
you should use \code{names_transform} or \code{values_transform} instead.}

\item{names_transform, values_transform}{Optionally, a list of column
name-function pairs. Alternatively, a single function can be supplied,
which will be applied to all columns. Use these arguments if you need to
change the types of specific columns. For example, \code{names_transform = list(week = as.integer)} would convert a character variable called \code{week}
to an integer.

If not specified, the type of the columns generated from \code{names_to} will
be character, and the type of the variables generated from \code{values_to}
will be the common type of the input columns used to generate them.}

\item{names_repair}{What happens if the output has invalid column names?
The default, \code{"check_unique"} is to error if the columns are duplicated.
Use \code{"minimal"} to allow duplicates in the output, or \code{"unique"} to
de-duplicated by adding numeric suffixes. See \code{\link[vctrs:vec_as_names]{vctrs::vec_as_names()}}
for more options.}

\item{values_to}{A string specifying the name of the column to create
from the data stored in cell values. If \code{names_to} is a character
containing the special \code{.value} sentinel, this value will be ignored,
and the name of the value column will be derived from part of the
existing column names.}

\item{values_drop_na}{If \code{TRUE}, will drop rows that contain only \code{NA}s
in the \code{value_to} column. This effectively converts explicit missing values
to implicit missing values, and should generally be used only when missing
values in \code{data} were created by its structure.}
}
\description{
\code{pivot_longer()} "lengthens" data, increasing the number of rows and
decreasing the number of columns. The inverse transformation is
\code{\link[=pivot_wider]{pivot_wider()}}

Learn more in \code{vignette("pivot")}.
}
\details{
\code{pivot_longer()} is an updated approach to \code{\link[=gather]{gather()}}, designed to be both
simpler to use and to handle more use cases. We recommend you use
\code{pivot_longer()} for new code; \code{gather()} isn't going away but is no longer
under active development.
}
\examples{
# See vignette("pivot") for examples and explanation

# Simplest case where column names are character data
relig_income
relig_income \%>\%
  pivot_longer(!religion, names_to = "income", values_to = "count")

# Slightly more complex case where columns have common prefix,
# and missing missings are structural so should be dropped.
billboard
billboard \%>\%
  pivot_longer(
    cols = starts_with("wk"),
    names_to = "week",
    names_prefix = "wk",
    values_to = "rank",
    values_drop_na = TRUE
  )

# Multiple variables stored in column names
who \%>\% pivot_longer(
  cols = new_sp_m014:newrel_f65,
  names_to = c("diagnosis", "gender", "age"),
  names_pattern = "new_?(.*)_(.)(.*)",
  values_to = "count"
)

# Multiple observations per row. Since all columns are used in the pivoting
# process, we'll use `cols_vary` to keep values from the original columns
# close together in the output.
anscombe
anscombe \%>\%
  pivot_longer(
    everything(),
    cols_vary = "slowest",
    names_to = c(".value", "set"),
    names_pattern = "(.)(.)"
  )
}