File: cache_rds.Rd

package info (click to toggle)
r-cran-xfun 0.20-1
  • links: PTS, VCS
  • area: main
  • in suites: bullseye
  • size: 680 kB
  • sloc: ansic: 242; sh: 22; makefile: 2
file content (133 lines) | stat: -rw-r--r-- 6,287 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/cache.R
\name{cache_rds}
\alias{cache_rds}
\title{Cache the value of an R expression to an RDS file}
\usage{
cache_rds(
  expr = { },
  rerun = FALSE,
  file = "cache.rds",
  dir = "cache/",
  hash = NULL,
  clean = getOption("xfun.cache_rds.clean", TRUE),
  ...
)
}
\arguments{
\item{expr}{An R expression.}

\item{rerun}{Whether to delete the RDS file, rerun the expression, and save
the result again (i.e., invalidate the cache if it exists).}

\item{file}{The \emph{base} (see Details) cache filename under the directory
specified by the \code{dir} argument. If not specified and this function is
called inside a code chunk of a \pkg{knitr} document (e.g., an R Markdown
document), the default is the current chunk label plus the extension
\file{.rds}.}

\item{dir}{The path of the RDS file is partially determined by
\code{paste0(dir, file)}. If not specified and the \pkg{knitr} package is
available, the default value of \code{dir} is the \pkg{knitr} chunk option
\code{cache.path} (so if you are compiling a \pkg{knitr} document, you do
not need to provide this \code{dir} argument explicitly), otherwise the
default is \file{cache/}. If you do not want to provide a \code{dir} but
simply a valid path to the \code{file} argument, you may use \code{dir =
""}.}

\item{hash}{A \code{list} object that contributes to the MD5 hash of the
cache filename (see Details). It can also take a special character value
\code{"auto"}. Other types of objects are ignored.}

\item{clean}{Whether to clean up the old cache files automatically when
\code{expr} has changed.}

\item{...}{Other arguments to be passed to \code{\link{saveRDS}()}.}
}
\value{
If the cache file does not exist, run the expression and save the
  result to the file, otherwise read the cache file and return the value.
}
\description{
Save the value of an expression to a cache file (of the RDS format). Next
time the value is loaded from the file if it exists.
}
\details{
Note that the \code{file} argument does not provide the full cache filename.
The actual name of the cache file is of the form \file{BASENAME_HASH.rds},
where \file{BASENAME} is the base name provided via the \file{file} argument
(e.g., if \code{file = 'foo.rds'}, \code{BASENAME} would be \file{foo}), and
\file{HASH} is the MD5 hash (also called the \sQuote{checksum}) calculated
from the R code provided to the \code{expr} argument and the value of the
\code{hash} argument, which means when the code or the \code{hash} argument
changes, the \file{HASH} string may also change, and the old cache will be
invalidated (if it exists). If you want to find the cache file, look for
\file{.rds} files that contain 32 hexadecimal digits (consisting of 0-9 and
a-z) at the end of the filename.

The possible ways to invalidate the cache are: 1) change the code in
\code{expr} argument; 2) delete the cache file manually or automatically
through the argument \code{rerun = TRUE}; and 3) change the value of the
\code{hash} argument. The first two ways should be obvious. For the third
way, it makes it possible to automatically invalidate the cache based on
changes in certain R objects. For example, when you run \code{cache_rds({ x +
y })}, you may want to invalidate the cache to rerun \code{{ x + y }} when
the value of \code{x} or \code{y} has been changed, and you can tell
\code{cache_rds()} to do so by \code{cache_rds({ x + y }, hash = list(x,
y))}. The value of the argument \code{hash} is expected to be a list, but it
can also take a special value, \code{"auto"}, which means
\code{cache_rds(expr)} will try to automatically figure out the global
variables in \code{expr}, return a list of their values, and use this list as
the actual value of \code{hash}. This behavior is most likely to be what you
really want: if the code in \code{expr} uses an external global variable, you
may want to invalidate the cache if the value of the global variable has
changed. Here a \dQuote{global variable} means a variable not created locally
in \code{expr}, e.g., for \code{cache_rds({ x <- 1; x + y })}, \code{x} is a
local variable, and \code{y} is (most likely to be) a global variable, so
changes in \code{y} should invalidate the cache. However, you know your own
code the best. If you want to be completely sure when to invalidate the
cache, you can always provide a list of objects explicitly rather than
relying on \code{hash = "auto"}.

By default (the argument \code{clean = TRUE}), old cache files will be
automatically cleaned up. Sometimes you may want to use \code{clean = FALSE}
(set the R global option \code{options(xfun.cache_rds.clean = FALSE)} if you
want \code{FALSE} to be the default). For example, you may not have decided
which version of code to use, and you can keep the cache of both versions
with \code{clean = FALSE}, so when you switch between the two versions of
code, it will still be fast to run the code.
}
\note{
Changes in the code in the \code{expr} argument do not necessarily
  always invalidate the cache, if the changed code is \code{\link{parse}d} to
  the same expression as the previous version of the code. For example, if
  you have run \code{cache_rds({Sys.sleep(5);1+1})} before, running
  \code{cache_rds({ Sys.sleep( 5 ) ; 1 + 1 })} will use the cache, because
  the two expressions are essentially the same (they only differ in white
  spaces). Usually you can add/delete white spaces or comments to your code
  in \code{expr} without invalidating the cache. See the package vignette
  \code{vignette('xfun', package = 'xfun')} for more examples.

  When this function is called in a code chunk of a \pkg{knitr} document, you
  may not want to provide the filename or directory of the cache file,
  because they have reasonable defaults.

  Side-effects (such as plots or printed output) will not be cached. The
  cache only stores the last value of the expression in \code{expr}.
}
\examples{
f = tempfile()  # the cache file
compute = function(...) {
    res = xfun::cache_rds({
        Sys.sleep(1)
        1:10
    }, file = f, dir = "", ...)
    res
}
compute()  # takes one second
compute()  # returns 1:10 immediately
compute()  # fast again
compute(rerun = TRUE)  # one second to rerun
compute()
file.remove(f)
}