# tidyselect 1.2.1
* Performance improvements (#337, #338, #339, #341)
* `eval_select()` out-of-bounds errors now use the verb "select" rather than
"subset" in the error message for consistency with `dplyr::select()` (#271).
* Fix for CRAN checks.
# tidyselect 1.2.0
## New features
* New `tidyselect_data_proxy()` and `tidyselect_data_has_predicates()`
allow tidyselect to work with custom input types (#242).
* New `eval_relocate()` for moving a selection. This powers `dplyr::relocate()`
(#232).
## Lifecycle changes
* Using `all_of()` outside of a tidyselect context is now deprecated (#269).
In the future it will error to be consistent with `any_of()`.
* Use of `.data` in tidyselect expressions is now deprecated to more cleanly
separate tidy-select from data-masking. Replace `.data$x` with `"x"` and
`.data[[var]]` with `all_of(var)` (#169).
* Use of bare predicates (not wrapped in `where()`) and indirection (without
using `all_of()`) have been formally deprecated (#317).
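As a rough sketch of the migrations described in this section, assuming dplyr is installed as the host package and using the built-in `mtcars` and `iris` datasets purely for illustration (the object names `var`, `out` and `factors` are not part of tidyselect):

```r
library(dplyr)

var <- "cyl"

# Deprecated: select(mtcars, .data$mpg, .data[[var]])
# Preferred: a string for a fixed name, all_of() for indirection.
out <- select(mtcars, "mpg", all_of(var))
names(out)

# Deprecated: select(iris, is.factor)
# Preferred: wrap the predicate in where().
factors <- select(iris, where(is.factor))
names(factors)
```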
## Minor improvements and bug fixes
* Selection language:
* `any_of()` generates a more informative error if you supply too many
arguments (#241).
* `all_of()` (like `any_of()`) returns an integer vector to make it easier
to combine in functions (#270, #294). It also fails when it can't find
variables even when `strict = FALSE`.
* `matches()` recognises and correctly uses stringr pattern objects
(`stringr::regex()`, `stringr::fixed()`, etc) (#238). It also now
works with named vectors (#250).
* `num_range()` gains a `suffix` argument (#229).
* `where()` is now exported, like all other select helpers (#201),
and gives more informative errors (#236).
* `eval_select()` with `include` now preserves the order of the variables
if they're present in the selection (#224).
* `eval_select()` always returns a named vector, even when renaming is not
permitted (#220).
* `eval_select()` and `eval_relocate()` gain a new `allow_empty` argument which
makes it possible to forbid empty selections with `allow_empty = FALSE` (#252).
* `eval_select(allow_rename = FALSE)` no longer fails with empty
selections (#221, @eutwt) or with predicate functions (#225). It now properly
fails with partial renaming (#305).
* `peek_var()` error now generates a hyperlink to docs with recent RStudio (#289).
* `vars_pull()` generates more informative error messages (#234, #258, #318)
and gains `error_call` and `error_arg` arguments.
* Errors produced by tidyselect should now be more informative. Evaluation
errors are now chained, with the child error call set to the `error_call`
argument of `eval_select()` and `eval_rename()`. We've also improved
backtraces of base errors, and done better at propagating the root
`error_call` to vctrs input checkers.
* `tidyselect_verbosity` is no longer used; deprecation messaging is now
controlled by `lifecycle_verbosity` like all other packages (#317).
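The `allow_rename` and `allow_empty` arguments of `eval_select()` mentioned in this section can be sketched as follows. This is a minimal illustration, assuming rlang is installed; `fails()` is a hypothetical helper written for this example, not a tidyselect function.

```r
library(tidyselect)

# A permissive selection returns a named vector of locations.
loc <- eval_select(rlang::expr(c(mpg, cyl)), mtcars)
loc

# Hypothetical helper: TRUE if the selection throws an error.
fails <- function(expr, ...) {
  tryCatch({
    eval_select(expr, mtcars, ...)
    FALSE
  }, error = function(cnd) TRUE)
}

# Renaming is rejected when allow_rename = FALSE.
rename_blocked <- fails(rlang::expr(c(foo = mpg)), allow_rename = FALSE)

# An empty selection is rejected when allow_empty = FALSE.
empty_blocked <- fails(rlang::expr(starts_with("zzz")), allow_empty = FALSE)
```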
# tidyselect 1.1.2
* Fix for CRAN checks.
* Better compatibility with rlang 1.0.0 errors. More to come soon.
# tidyselect 1.1.1
* Fix for CRAN checks.
* tidyselect has been re-licensed as MIT (#217).
# tidyselect 1.1.0
* Predicate functions must now be wrapped with `where()`.
```{r}
iris %>% select(where(is.factor))
```
We made this change to avoid puzzling error messages when a variable
is unexpectedly missing from the data frame and there is a
corresponding function in the environment:
```{r}
# Attempts to invoke `data()` function
data.frame(x = 1) %>% select(data)
```
Now tidyselect will correctly complain about a missing variable
rather than trying to invoke a function.
For compatibility we will support predicate functions starting with
`is` for one version.
* `eval_select()` gains an `allow_rename` argument. If set to `FALSE`,
renaming variables with the `c(foo = bar)` syntax is an error.
This is useful to implement purely selective behaviour (#178).
* Fixed issue preventing repeated deprecation messages when
`tidyselect_verbosity` is set to `"verbose"` (#184).
* `any_of()` now preserves the order of the input variables (#186).
* The return value of `eval_select()` is now always named, even when
inputs are constant (#173).
# tidyselect 1.0.0
This is the 1.0.0 release of tidyselect. It features a more solidly
defined and implemented syntax, support for predicate functions, new
boolean operators, and much more.
## Documentation
* New Get started vignette for client packages. Read it with
`vignette("tidyselect")` or at
<https://tidyselect.r-lib.org/articles/tidyselect.html>.
* The definition of the tidyselect language has been consolidated. A
technical description is now available:
<https://tidyselect.r-lib.org/articles/syntax.html>.
## Breaking changes
* Selecting non-column variables with bare names now triggers an
informative message suggesting to use `all_of()` instead. Referring
to contextual objects with a bare name is brittle because it might
be masked by a data frame column. Using `all_of()` is safe (#76).
tidyselect now uses vctrs for validating inputs. These changes may
reveal programming errors that were previously silent. They may also
cause failures if your unit tests make faulty assumptions about the
content of error messages created in tidyselect:
* Out-of-bounds errors are thrown when a name doesn't exist or a
location is too large for the input.
* Logical vectors now fail properly.
* Selected variables now must be unique. It was previously possible to
return duplicate selections in some circumstances.
* The input names can no longer contain `NA` values.
Note that we recommend `testthat::verify_output()` for monitoring
error messages thrown from packages that you don't control. Unlike
`expect_error()`, `verify_output()` does not cause CMD check failures
when error messages have changed. See
<https://www.tidyverse.org/blog/2019/11/testthat-2-3-0/> for more
information.
## Syntax
* The boolean operators can now be used to create selections (#106).
- `!` negates a selection.
- `|` takes the union of two selections.
- `&` takes the intersection of two selections.
These patterns can currently be achieved using `-`, `c()` and
`intersect()` respectively. The boolean operators should be more
intuitive to use.
Many thanks to Irene Steves (@isteves) for suggesting this UI.
* You can now use predicate functions in selection contexts:
```r
iris %>% select(is.factor)
iris %>% select(is.factor | is.numeric)
```
This feature is not available in functions that use the legacy
interface of tidyselect. These need to be updated to use
the new `eval_select()` function instead of `vars_select()`.
* Unary `-` inside nested `c()` is now consistently syntax for set
difference (#130).
* Improved support for named elements. It is now possible to assign
the same name to multiple elements, if the input data structure
doesn't require unique names (i.e. anything but a data frame).
* The selection engine has been rewritten to support a clearer
separation between data-expressions (calls to `:`, `-`, and `c`) and
env-expressions (anything else). This means you can now safely use
expressions of the type:
```r
data %>% select(1:ncol(data))
data %>% pivot_longer(1:ncol(data))
```
Even if the data frame `data` contains a column also named `data`,
the subexpression `ncol(data)` is still correctly evaluated.
The `data:ncol(data)` expression is equivalent to `2:3` because
`data` is looked up in the relevant context without ambiguity:
```r
data <- tibble(foo = 1, data = 2, bar = 3)
data %>% dplyr::select(data:ncol(data))
#> # A tibble: 1 x 2
#> data bar
#> <dbl> <dbl>
#> 1 2 3
```
While the example above is a bit contrived, there are many realistic
cases where these changes make it easier to write safe code:
```{r}
select_from <- function(data, var) {
data %>% dplyr::select({{ var }} : ncol(data))
}
data %>% select_from(data)
#> # A tibble: 1 x 2
#> data bar
#> <dbl> <dbl>
#> 1 2 3
```
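The boolean operators introduced at the top of this section can be sketched as follows. This is an illustrative example only, assuming dplyr as the host package and the built-in `iris` dataset; the object names are arbitrary.

```r
library(dplyr)

# `!` negates a selection: everything except the Sepal columns.
negated <- select(iris, !starts_with("Sepal"))

# `|` takes the union of two selections.
either <- select(iris, starts_with("Sepal") | ends_with("Width"))

# `&` takes the intersection of two selections.
both <- select(iris, starts_with("Sepal") & ends_with("Width"))

names(negated)
names(either)
names(both)
```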
## User-facing improvements
* The new selection helpers `all_of()` and `any_of()` are strict
variants of `one_of()`. The former always fails if some variables
are unknown, while the latter does not. `all_of()` is safer to use
when you expect all selected variables to exist. `any_of()` is
useful in other cases, for instance to ensure variables are selected
out:
```
vars <- c("Species", "Genus")
iris %>% dplyr::select(-any_of(vars))
```
Note that `all_of()` and `any_of()` are a bit more conservative in
their function signature than `one_of()`: they do not accept dots.
The equivalent of `one_of("a", "b")` is `all_of(c("a", "b"))`.
* Selection helpers like `all_of()` and `starts_with()` are now
available in all selection contexts, even when they haven't been
attached to the search path. The most visible consequence of this
change is that it is now easier to use selection functions without
attaching the host package:
```r
# Before
dplyr::select(mtcars, dplyr::starts_with("c"))
# After
dplyr::select(mtcars, starts_with("c"))
```
It is still recommended to export the helpers from your package so
that users can easily look up the documentation with `?`.
* `starts_with()`, `ends_with()`, `contains()`, and `matches()` now
accept vector inputs (#50). For instance these are now equivalent
ways of selecting all variables that start with either `"a"` or `"b"`:
```{r}
starts_with(c("a", "b"))
starts_with("a") | starts_with("b")
```
* `matches()` has a new `perl` argument to allow for Perl-like regular
expressions (@fmichonneau, #71).
* Better support for selecting with S3 vectors. For instance, factors
are treated as characters.
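To make the difference between `all_of()` and `any_of()` concrete, here is a small sketch assuming dplyr as the host package. `"Genus"` is deliberately not a column of `iris`, and `strict_ok` is just an illustrative flag.

```r
library(dplyr)

vars <- c("Species", "Genus")  # "Genus" does not exist in iris

# any_of() silently skips unknown names, so only Species is dropped.
kept <- select(iris, -any_of(vars))
names(kept)

# all_of() fails because one of the requested variables is unknown.
strict_ok <- tryCatch({
  select(iris, all_of(vars))
  TRUE
}, error = function(cnd) FALSE)
strict_ok
```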
## API
New `eval_select()` and `eval_rename()` functions for client
packages. These replace `vars_select()` and `vars_rename()`, which are
now deprecated. These functions:
* Take the full data rather than just names. This makes it possible to
use function predicates in selection context.
* Return a numeric vector of locations rather than a vector of
names. This makes it possible to use tidyselect with inputs that
support duplicate names, like regular vectors.
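A client package might wrap `eval_select()` along the following lines. This is a minimal sketch, assuming rlang is installed; `my_select()` is a hypothetical function used only for illustration, not part of tidyselect.

```r
library(tidyselect)

# Hypothetical client function: select (and optionally rename) elements.
my_select <- function(.data, ...) {
  # eval_select() receives the full data and returns named locations.
  loc <- eval_select(rlang::expr(c(...)), .data)
  # Subset by location, then apply the (possibly new) names.
  rlang::set_names(.data[loc], names(loc))
}

# Works with a data frame...
df <- my_select(mtcars, mpg, cylinders = cyl)
names(df)

# ...and with a plain named vector.
x <- c(a = 1, b = 2, c = 3)
picked <- my_select(x, c, first = a)
picked
```

Working with locations rather than names is what lets the same wrapper handle both input types.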
## Other features and fixes
* The `.strict` argument of `vars_select()` now works more robustly
and consistently.
* Using arithmetic operators in selection context now fails more
informatively (#84).
* It is now possible to select columns in data frames containing
duplicate variables (#94). However, the duplicates can't be part of
the final selection.
* `eval_rename()` no longer ignores the names of unquoted character
vectors of length 1 (#79).
* `eval_rename()` now fails when a variable is renamed to an existing
name (#70).
* `eval_rename()` has better support for existing duplicates (but
creating new duplicates is an error).
* `eval_select()`, `eval_rename()` and `vars_pull()` now detect
missing values uniformly (#72).
* `vars_pull()` now includes the faulty expression in error messages.
* The performance issues of `eval_rename()` with many arguments have
been fixed. This makes `dplyr::rename_all()` with many columns much
faster (@zkamvar, #92).
* tidyselect is now much faster with many columns, thanks to a
performance fix in `rlang::env_bind()` as well as internal fixes.
* `vars_select()` ignores vectors with only zeros (#82).
# tidyselect 0.2.5
This is a maintenance release for compatibility with rlang 0.3.0.
# tidyselect 0.2.4
* Fixed a warning that occurred when a vector of column positions was
supplied to `vars_select()` or functions depending on it such as
`tidyr::gather()` (#43 and tidyverse/tidyr#374).
* Fixed compatibility issue with rlang 0.2.0 (#51).
# tidyselect 0.2.3
* Internal fixes in anticipation of using `tidyselect` within `dplyr`.
* `vars_select()` and `vars_rename()` now correctly support unquoting
character vectors that have names.
* `vars_select()` now ignores missing variables.
# tidyselect 0.2.2
* `dplyr` is now correctly mentioned as suggested package.
# tidyselect 0.2.1
* `-` now supports character vectors in addition to strings. This
makes it easy to unquote column names to exclude from the set:
```{r}
vars <- c("cyl", "am", "disp", "drat")
vars_select(names(mtcars), - !!vars)
```
* `last_col()` now issues an error when the variable vector is empty.
* `last_col()` now returns column positions rather than column names
for consistency with other helpers. This also makes it compatible
with functions like `seq()`.
* `c()` now supports character vectors the same way as `-` and `seq()`.
(#37 @gergness)
# tidyselect 0.2.0
The main point of this release is to revert a troublesome behaviour
introduced in tidyselect 0.1.0. It also includes a few features.
## Evaluation rules
The special evaluation semantics for selection have been changed
back to the old behaviour because the new rules were causing too
much trouble and confusion. From now on data expressions (symbols
and calls to `:` and `c()`) can refer to both registered variables
and to objects from the context.
However the semantics for context expressions (any calls other than
to `:` and `c()`) remain the same. Those expressions are evaluated
in the context only and cannot refer to registered variables.
If you're writing functions and refer to contextual objects, it is
still a good idea to avoid data expressions. Registered variables
change as a function of user input, so you never know whether your
local objects might be shadowed by a variable. Consider:
```
n <- 2
vars_select(letters, 1:n)
```
Should that select up to the second element of `letters` or up to
the 14th? Since the variables have precedence in a data expression,
this will select the first 14 letters. This can be made more robust
by turning the data expression into a context expression:
```
vars_select(letters, seq(1, n))
```
You can also use quasiquotation since unquoted arguments are
guaranteed to be evaluated without any user data in scope. While
equivalent because of the special rules for context expressions,
this may be clearer to the reader accustomed to tidy eval:
```{r}
vars_select(letters, seq(1, !! n))
```
Finally, you may want to be more explicit in the opposite direction.
If you expect a variable to be found in the data but not in the
context, you can use the `.data` pronoun:
```{r}
vars_select(names(mtcars), .data$cyl : .data$drat)
```
## New features
* The new select helper `last_col()` is helpful to select over a
custom range: `vars_select(vars, 3:last_col())`.
* `:` and `-` now handle strings as well. This makes it easy to
unquote a column name: `(!!name) : last_col()` or `- !!name`.
* `vars_select()` gains a `.strict` argument similar to
`rename_vars()`. If set to `FALSE`, errors about unknown variables
are ignored.
* `vars_select()` now treats `NULL` as empty inputs. This follows a
trend in the tidyverse tools.
* `vars_rename()` now handles variable positions (integers or round
doubles) just like `vars_select()` (#20).
* `vars_rename()` is now implemented with the tidy eval framework.
Like `vars_select()`, expressions are evaluated without any user
data in scope. In addition a variable context is now established so
you can write rename helpers. Those should return a single round
number or a string (variable position or variable name).
* `has_vars()` is a predicate that tests whether a variable context
has been set (#21).
* The selection helpers are now exported in a list
`vars_select_helpers`. This is intended for APIs that embed the
helpers in the evaluation environment.
## Fixes
* `one_of()` argument `vars` has been renamed to `.vars` to avoid
spurious matching.
# tidyselect 0.1.1
tidyselect is the new home for the legacy functions
`dplyr::select_vars()`, `dplyr::rename_vars()` and
`dplyr::select_var()`.
## API changes
We took this opportunity to make a few changes to the API:
* `select_vars()` and `rename_vars()` are now `vars_select()` and
`vars_rename()`. This follows the tidyverse convention that a prefix
corresponds to the input type while suffixes indicate the output
type. Similarly, `select_var()` is now `vars_pull()`.
* The arguments are now prefixed with dots to limit argument matching
issues. While the dots help, it is still a good idea to splice a
list of captured quosures to make sure dotted arguments are never
matched to `vars_select()`'s named arguments:
```
vars_select(vars, !!! quos(...))
```
* Error messages can now be customised. For consistency with dplyr,
error messages refer to "columns" by default. This assumes that the
variables being selected come from a data frame. If this is not
appropriate for your DSL, you can now add an attribute `vars_type`
to the `.vars` vector to specify alternative names. This must be a
character vector of length 2 whose first component is the singular
form and the second is the plural. For example, `c("variable",
"variables")`.
## Establishing a variable context
tidyselect provides a few more ways of establishing a variable
context:
* `scoped_vars()` sets up a variable context along with an exit
hook that automatically restores the previous variables. It is the
preferred way of changing the variable context.
* `with_vars()` takes variables and an expression and evaluates the
latter in the context of the former.
* `poke_vars()` establishes a new variable context. It returns the
previous context invisibly and it is your responsibility to restore
it after you are done. This is for expert use only.
* `current_vars()` has been renamed to `peek_vars()`. This naming is a
reference to [peek and poke](https://en.wikipedia.org/wiki/PEEK_and_POKE)
from legacy languages.
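As a rough illustration of a temporary variable context, the sketch below registers a character vector with `with_vars()` and reads it back with `peek_vars()`. It assumes both functions are still exported by the installed tidyselect version; the vector `vars` is arbitrary example data.

```r
library(tidyselect)

vars <- c("cyl", "disp", "drat")

# with_vars() registers `vars` only while the expression is evaluated,
# so peek_vars() can see them inside the call.
seen <- with_vars(vars, peek_vars())
seen
```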
## New evaluation semantics
The evaluation semantics for selecting verbs have changed. Symbols are
now evaluated in a data-only context that is isolated from the calling
environment. This means that you can no longer refer to local variables
unless you are explicitly unquoting these variables with `!!`, which
is mostly for expert use.
Note that since dplyr 0.7, helper calls (like `starts_with()`) obey
the opposite behaviour and are evaluated in the calling context
isolated from the data context. To sum up, symbols can only refer to
data frame objects, while helpers can only refer to contextual
objects. This differs from usual R evaluation semantics where both
the data and the calling environment are in scope (with the former
prevailing over the latter).