1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98
|
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/complete.R
\name{complete}
\alias{complete}
\title{Complete a data frame with missing combinations of data}
\usage{
complete(data, ..., fill = list(), explicit = TRUE)
}
\arguments{
\item{data}{A data frame.}
\item{...}{<\code{\link[=tidyr_data_masking]{data-masking}}> Specification of columns
to expand or complete. Columns can be atomic vectors or lists.
\itemize{
\item To find all unique combinations of \code{x}, \code{y} and \code{z}, including those not
present in the data, supply each variable as a separate argument:
\code{expand(df, x, y, z)} or \code{complete(df, x, y, z)}.
\item To find only the combinations that occur in the
data, use \code{nesting}: \code{expand(df, nesting(x, y, z))}.
\item You can combine the two forms. For example,
\code{expand(df, nesting(school_id, student_id), date)} would produce
a row for each present school-student combination for all possible
dates.
}
When used with factors, \code{\link[=expand]{expand()}} and \code{\link[=complete]{complete()}} use the full set of
levels, not just those that appear in the data. If you want to use only the
values seen in the data, use \code{forcats::fct_drop()}.
When used with continuous variables, you may need to fill in values
that do not appear in the data: to do so use expressions like
\code{year = 2010:2020} or \code{year = full_seq(year,1)}.}
\item{fill}{A named list that for each variable supplies a single value to
use instead of \code{NA} for missing combinations.}
\item{explicit}{Should both implicit (newly created) and explicit
(pre-existing) missing values be filled by \code{fill}? By default, this is
\code{TRUE}, but if set to \code{FALSE} this will limit the fill to only implicit
missing values.}
}
\description{
Turns implicit missing values into explicit missing values. This is a wrapper
around \code{\link[=expand]{expand()}}, \code{\link[dplyr:mutate-joins]{dplyr::full_join()}} and \code{\link[=replace_na]{replace_na()}} that's useful for
completing missing combinations of data.
}
\section{Grouped data frames}{
With grouped data frames created by \code{\link[dplyr:group_by]{dplyr::group_by()}}, \code{complete()}
operates \emph{within} each group. Because of this, you cannot complete a grouping
column.
}
\examples{
df <- tibble(
group = c(1:2, 1, 2),
item_id = c(1:2, 2, 3),
item_name = c("a", "a", "b", "b"),
value1 = c(1, NA, 3, 4),
value2 = 4:7
)
df
# Combinations --------------------------------------------------------------
# Generate all possible combinations of `group`, `item_id`, and `item_name`
# (whether or not they appear in the data)
df \%>\% complete(group, item_id, item_name)
# Cross all possible `group` values with the unique pairs of
# `(item_id, item_name)` that already exist in the data
df \%>\% complete(group, nesting(item_id, item_name))
# Within each `group`, generate all possible combinations of
# `item_id` and `item_name` that occur in that group
df \%>\%
dplyr::group_by(group) \%>\%
complete(item_id, item_name)
# Supplying values for new rows ---------------------------------------------
# Use `fill` to replace NAs with some value. By default, affects both new
# (implicit) and pre-existing (explicit) missing values.
df \%>\%
complete(
group,
nesting(item_id, item_name),
fill = list(value1 = 0, value2 = 99)
)
# Limit the fill to only the newly created (i.e. previously implicit)
# missing values with `explicit = FALSE`
df \%>\%
complete(
group,
nesting(item_id, item_name),
fill = list(value1 = 0, value2 = 99),
explicit = FALSE
)
}
|