Add caching with reactivity to an object — bindCache
R/bind-cache.R
Description
bindCache()
adds caching reactive()
expressions and render*
functions
(like renderText()
, renderTable()
, ...).
Ordinary reactive()
expressions automatically cache their most recent
value, which helps to avoid redundant computation in downstream reactives.
bindCache()
will cache all previous values (as long as they fit in the
cache) and they can be shared across user sessions. This allows
bindCache()
to dramatically improve performance when used correctly.
Arguments
- x
The object to add caching to.
- ...
One or more expressions to use in the caching key.
- cache
The scope of the cache, or a cache object. This can be
"app"
(the default),"session"
, or a cache object like acachem::cache_disk()
. See the Cache Scoping section for more information.
Details
bindCache()
requires one or more expressions that are used to generate a
cache key, which is used to determine if a computation has occurred
before and hence can be retrieved from the cache. If you're familiar with the
concept of memoizing pure functions (e.g., the memoise package), you
can think of the cache key as the input(s) to a pure function. As such, one
should take care to make sure the use of bindCache()
is pure in the same
sense, namely:
For a given key, the return value is always the same.
Evaluation has no side-effects.
In the example here, the bindCache()
key consists of input$x
and
input$y
combined, and the value is input$x * input$y
. In this simple
example, for any given key, there is only one possible returned value.
The largest performance improvements occur when the cache key is fast to compute and the reactive expression is slow to compute. To see if the value should be computed, a cached reactive evaluates the key, and then serializes and hashes the result. If the resulting hashed key is in the cache, then the cached reactive simply retrieves the previously calculated value and returns it; if not, then the value is computed and the result is stored in the cache before being returned.
To compute the cache key, bindCache()
hashes the contents of ...
, so it's
best to avoid including large objects in a cache key since that can result in
slow hashing. It's also best to avoid reference objects like environments and
R6 objects, since the serialization of these objects may not capture relevant
changes.
If you want to use a large object as part of a cache key, it may make sense to do some sort of reduction on the data that still captures information about whether a value can be retrieved from the cache. For example, if you have a large data set with timestamps, it might make sense to extract the most recent timestamp and return that. Then, instead of hashing the entire data object, the cached reactive only needs to hash the timestamp.
For computations that are very slow, it often makes sense to pair
bindCache()
with bindEvent()
so that no computation is performed until
the user explicitly requests it (for more, see the Details section of
bindEvent()
).
Cache keys and reactivity
Because the value expression (from the original reactive()
) is
cached, it is not necessarily re-executed when someone retrieves a value,
and therefore it can't be used to decide what objects to take reactive
dependencies on. Instead, the key is used to figure out which objects
to take reactive dependencies on. In short, the key expression is reactive,
and value expression is no longer reactive.
Here's an example of what not to do: if the key is input$x
and the value
expression is from reactive({input$x + input$y})
, then the resulting
cached reactive will only take a reactive dependency on input$x
-- it
won't recompute {input$x + input$y}
when just input$y
changes.
Moreover, the cache won't use input$y
as part of the key, and so it could
return incorrect values in the future when it retrieves values from the
cache. (See the examples below for an example of this.)
A better cache key would be something like input$x, input$y
. This does
two things: it ensures that a reactive dependency is taken on both
input$x
and input$y
, and it also makes sure that both values are
represented in the cache key.
In general, key
should use the same reactive inputs as value
, but the
computation should be simpler. If there are other (non-reactive) values
that are consumed, such as external data sources, they should be used in
the key
as well. Note that if the key
is large, it can make sense to do
some sort of reduction on it so that the serialization and hashing of the
cache key is not too expensive.
Remember that the key is reactive, so it is not re-executed every single time that someone accesses the cached reactive. It is only re-executed if it has been invalidated by one of the reactives it depends on. For example, suppose we have this cached reactive:
In this case, the key expression is essentially reactive(list(input$x, input$y))
(there's a bit more to it, but that's a good enough
approximation). The first time r()
is called, it executes the key, then
fails to find it in the cache, so it executes the value expression, { input$x + input$y }
. If r()
is called again, then it does not need to
re-execute the key expression, because it has not been invalidated via a
change to input$x
or input$y
; it simply returns the previous value.
However, if input$x
or input$y
changes, then the reactive expression will
be invalidated, and the next time that someone calls r()
, the key
expression will need to be re-executed.
Note that if the cached reactive is passed to bindEvent()
, then the key
expression will no longer be reactive; instead, the event expression will be
reactive.
Cache scope
By default, when bindCache()
is used, it is scoped to the running
application. That means that it shares a cache with all user sessions
connected to the application (within the R process). This is done with the
cache
parameter's default value, "app"
.
With an app-level cache scope, one user can benefit from the work done for another user's session. In most cases, this is the best way to get performance improvements from caching. However, in some cases, this could leak information between sessions. For example, if the cache key does not fully encompass the inputs used by the value, then data could leak between the sessions. Or if a user sees that a cached reactive returns its value very quickly, they may be able to infer that someone else has already used it with the same values.
It is also possible to scope the cache to the session, with
cache="session"
. This removes the risk of information leaking between
sessions, but then one session cannot benefit from computations performed in
another session.
It is possible to pass in caching objects directly to
bindCache()
. This can be useful if, for example, you want to use a
particular type of cache with specific cached reactives, or if you want to
use a cachem::cache_disk()
that is shared across multiple processes and
persists beyond the current R session.
To use different settings for an application-scoped cache, you can call
shinyOptions()
at the top of your app.R, server.R, or
global.R. For example, this will create a cache with 500 MB of space
instead of the default 200 MB:
shinyOptions(cache = cachem::cache_mem(max_size = 500e6))
To use different settings for a session-scoped cache, you can set
self$cache
at the top of your server function. By default, it will create
a 200 MB memory cache for each session, but you can replace it with
something different. To use the session-scoped cache, you must also call
bindCache()
with cache="session"
. This will create a 100 MB cache for
the session:
function(input, output, session) {
session$cache <- cachem::cache_mem(max_size = 100e6)
...
}
If you want to use a cache that is shared across multiple R processes, you
can use a cachem::cache_disk()
. You can create a application-level shared
cache by putting this at the top of your app.R, server.R, or global.R:
shinyOptions(cache = cachem::cache_disk(file.path(dirname(tempdir()), "myapp-cache"))
This will create a subdirectory in your system temp directory named
myapp-cache
(replace myapp-cache
with a unique name of
your choosing). On most platforms, this directory will be removed when
your system reboots. This cache will persist across multiple starts and
stops of the R process, as long as you do not reboot.
To have the cache persist even across multiple reboots, you can create the cache in a location outside of the temp directory. For example, it could be a subdirectory of the application:
shinyOptions(cache = cachem::cache_disk("./myapp-cache"))
In this case, resetting the cache will have to be done manually, by deleting the directory.
You can also scope a cache to just one item, or selected items. To do that,
create a cachem::cache_mem()
or cachem::cache_disk()
, and pass it
as the cache
argument of bindCache()
.
Computing cache keys
The actual cache key that is used internally takes value from evaluating
the key expression(s) (from the ...
arguments) and combines it with the
(unevaluated) value expression.
This means that if there are two cached reactives which have the same result from evaluating the key, but different value expressions, then they will not need to worry about collisions.
However, if two cached reactives have identical key and value expressions
expressions, they will share the cached values. This is useful when using
cache="app"
: there may be multiple user sessions which create separate
cached reactive objects (because they are created from the same code in the
server function, but the server function is executed once for each user
session), and those cached reactive objects across sessions can share
values in the cache.
Async with cached reactives
With a cached reactive expression, the key and/or value expression can be
asynchronous. In other words, they can be promises --- not regular R
promises, but rather objects provided by the
promises package, which
are similar to promises in JavaScript. (See promises::promise()
for more
information.) You can also use future::future()
objects to run code in a
separate process or even on a remote machine.
If the value returns a promise, then anything that consumes the cached reactive must expect it to return a promise.
Similarly, if the key is a promise (in other words, if it is asynchronous), then the entire cached reactive must be asynchronous, since the key must be computed asynchronously before it knows whether to compute the value or the value is retrieved from the cache. Anything that consumes the cached reactive must therefore expect it to return a promise.
Developing render functions for caching
If you've implemented your own render*()
function, it may just work with
bindCache()
, but it is possible that you will need to make some
modifications. These modifications involve helping bindCache()
avoid
cache collisions, dealing with internal state that may be set by the,
render
function, and modifying the data as it goes in and comes out of
the cache.
You may need to provide a cacheHint
to createRenderFunction()
(or
htmlwidgets::shinyRenderWidget()
, if you've authored an htmlwidget) in
order for bindCache()
to correctly compute a cache key.
The potential problem is a cache collision. Consider the following:
output$x1 <- renderText({ input$x }) %>% bindCache(input$x)
output$x2 <- renderText({ input$x * 2 }) %>% bindCache(input$x)
Both output$x1
and output$x2
use input$x
as part of their cache key,
but if it were the only thing used in the cache key, then the two outputs
would have a cache collision, and they would have the same output. To avoid
this, a cache hint is automatically added when renderText()
calls
createRenderFunction()
. The cache hint is used as part of the actual
cache key, in addition to the one passed to bindCache()
by the user. The
cache hint can be viewed by calling the internal Shiny function
extractCacheHint()
:
This returns a nested list containing an item, $origUserFunc$body
, which
in this case is the expression which was passed to renderText()
:
{ input$x }
. This (quoted) expression is mixed into the actual cache
key, and it is how output$x1
does not have collisions with output$x2
.
For most developers of render functions, nothing extra needs to be done;
the automatic inference of the cache hint is sufficient. Again, you can
check it by calling shiny:::extractCacheHint()
, and by testing the
render function for cache collisions in a real application.
In some cases, however, the automatic cache hint inference is not
sufficient, and it is necessary to provide a cache hint. This is true
for renderPrint()
. Unlike renderText()
, it wraps the user-provided
expression in another function, before passing it to createRenderFunction()
(instead of createRenderFunction()
). Because the user code is wrapped in
another function, createRenderFunction()
is not able to automatically
extract the user-provided code and use it in the cache key. Instead,
renderPrint
calls createRenderFunction()
, it explicitly passes along a
cacheHint
, which includes a label and the original user expression.
In general, if you need to provide a cacheHint
, it is best practice to
provide a label
id, the user's expr
, as well as any other arguments
that may influence the final value.
For htmlwidgets, it will try to automatically infer a cache hint;
again, you can inspect the cache hint with shiny:::extractCacheHint()
and
also test it in an application. If you do need to explicitly provide a
cache hint, pass it to shinyRenderWidget
. For example:
renderMyWidget <- function(expr) {
q <- rlang::enquo0(expr)
htmlwidgets::shinyRenderWidget(
q,
myWidgetOutput,
quoted = TRUE,
cacheHint = list(label = "myWidget", userQuo = q)
)
}
If your render
function sets any internal state, you may find it useful
in your call to createRenderFunction()
to use
the cacheWriteHook
and/or cacheReadHook
parameters. These hooks are
functions that run just before the object is stored in the cache, and just
after the object is retrieved from the cache. They can modify the data
that is stored and retrieved; this can be useful if extra information needs
to be stored in the cache. They can also be used to modify the state of the
application; for example, it can call createWebDependency()
to make
JS/CSS resources available if the cached object is loaded in a different R
process. (See the source of htmlwidgets::shinyRenderWidget
for an example
of this.)
Uncacheable objects
Some render functions cannot be cached, typically because they have side effects or modify some external state, and they must re-execute each time in order to work properly.
For developers of such code, they should call createRenderFunction()
(or
markRenderFunction()
) with cacheHint = FALSE
.
Caching with renderPlot()
When bindCache()
is used with renderPlot()
, the height
and width
passed to the original renderPlot()
are ignored. They are superseded by
sizePolicy
argument passed to `bindCache. The default is:
sizePolicy = sizeGrowthRatio(width = 400, height = 400, growthRate = 1.2)
sizePolicy
must be a function that takes a two-element numeric vector as
input, representing the width and height of the <img>
element in the
browser window, and it must return a two-element numeric vector, representing
the pixel dimensions of the plot to generate. The purpose is to round the
actual pixel dimensions from the browser to some other dimensions, so that
this will not generate and cache images of every possible pixel dimension.
See sizeGrowthRatio()
for more information on the default sizing policy.
See also
bindEvent()
, renderCachedPlot()
for caching plots.
Examples
if (FALSE) {
rc <- bindCache(
x = reactive({
Sys.sleep(2) # Pretend this is expensive
input$x * 100
}),
input$x
)
# Can make it prettier with the %>% operator
library(magrittr)
rc <- reactive({
Sys.sleep(2)
input$x * 100
}) %>%
bindCache(input$x)
}
## Only run app examples in interactive R sessions
if (interactive()) {
# Basic example
shinyApp(
ui = fluidPage(
sliderInput("x", "x", 1, 10, 5),
sliderInput("y", "y", 1, 10, 5),
div("x * y: "),
verbatimTextOutput("txt")
),
server = function(input, output) {
r <- reactive({
# The value expression is an _expensive_ computation
message("Doing expensive computation...")
Sys.sleep(2)
input$x * input$y
}) %>%
bindCache(input$x, input$y)
output$txt <- renderText(r())
}
)
# Caching renderText
shinyApp(
ui = fluidPage(
sliderInput("x", "x", 1, 10, 5),
sliderInput("y", "y", 1, 10, 5),
div("x * y: "),
verbatimTextOutput("txt")
),
server = function(input, output) {
output$txt <- renderText({
message("Doing expensive computation...")
Sys.sleep(2)
input$x * input$y
}) %>%
bindCache(input$x, input$y)
}
)
# Demo of using events and caching with an actionButton
shinyApp(
ui = fluidPage(
sliderInput("x", "x", 1, 10, 5),
sliderInput("y", "y", 1, 10, 5),
actionButton("go", "Go"),
div("x * y: "),
verbatimTextOutput("txt")
),
server = function(input, output) {
r <- reactive({
message("Doing expensive computation...")
Sys.sleep(2)
input$x * input$y
}) %>%
bindCache(input$x, input$y) %>%
bindEvent(input$go)
# The cached, eventified reactive takes a reactive dependency on
# input$go, but doesn't use it for the cache key. It uses input$x and
# input$y for the cache key, but doesn't take a reactive depdency on
# them, because the reactive dependency is superseded by addEvent().
output$txt <- renderText(r())
}
)
}