File: rbind_pages.Rd

% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/rbind_pages.R
\name{rbind_pages}
\alias{rbind_pages}
\title{Combine pages into a single data frame}
\usage{
rbind_pages(pages)
}
\arguments{
\item{pages}{a list of data frames, each representing a \emph{page} of data}
}
\description{
The \code{rbind_pages} function combines a list of data frames into a single
data frame. This is often needed when working with a JSON API that limits the amount
of data returned per request. To get more data than fits in a single response, we need to
perform multiple requests that each retrieve a fragment of the data, not unlike pages in a
book. In practice this is often implemented using a \code{page} parameter in the API. The
\code{rbind_pages} function can be used to combine these pages back into a single dataset.
}
\details{
The \code{rbind_pages} function uses \code{\link[vctrs:vec_bind]{vctrs::vec_rbind()}}
to bind the pages together. This generalizes \code{\link[base:cbind]{base::rbind()}} in two
ways:
\itemize{
\item Not every column has to be present in every data frame; missing
columns are filled with \code{NA} values.
\item Data frames can be nested (can contain other data frames).
}
}
\examples{
# Basic example: columns missing from a page ('bar', 'col') are filled with NA
x <- data.frame(foo = rnorm(3), bar = c(TRUE, FALSE, TRUE))
y <- data.frame(foo = rnorm(2), col = c("blue", "red"))
rbind_pages(list(x, y))
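
# Nested example: a minimal sketch using made-up data, not taken from a real API.
# The 'stats' column is itself a data frame, as is common for records parsed
# from nested JSON objects. The names p1, p2, stats, views and likes are
# illustrative assumptions only.
p1 <- data.frame(id = 1:2)
p1$stats <- data.frame(views = c(10, 20), likes = c(1, 2))
p2 <- data.frame(id = 3L)
p2$stats <- data.frame(views = 30, likes = 3)
nested <- rbind_pages(list(p1, p2))
str(nested)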

\donttest{
# Paginated example: retrieve pages 0 to 20 from a JSON API (requires internet)
baseurl <- "https://projects.propublica.org/nonprofits/api/v2/search.json"
pages <- list()
for(i in 0:20){
  message("Retrieving page ", i)
  mydata <- fromJSON(paste0(baseurl, "?order=revenue&sort_order=desc&page=", i))
  pages[[i+1]] <- mydata$organizations
}
# Combine all pages into a single data frame
organizations <- rbind_pages(pages)
nrow(organizations)
colnames(organizations)
}
}