Using alternate database backends for Cache

Alex M. Chubaty

Eliot J. B. McIntire

August 16 2022

1 Cache database backends

By default, caching relies on a sqlite database for it’s backend. While this works in many situations, there are some important limitations of using sqlite for caching, including 1) speed; 2) concurrent transactions; 3) sharing database across machines or projects. Fortunately, Cache makes use of DBI package and thus supports several database backends, including mysql and postgresql. This vignette demonstrates how to use alternate database backends for caching.

2 Storing database credentials

Be careful not to save database credentials in version control. We recommend per-project credentials saved either in config files or environment variables, described below.

See https://db.rstudio.com/best-practices/managing-credentials/ for other important best practices.

3 PostgreSQL

3.1 Using config files

Add the following to your project’s config.yml file:

default:
  cachedb:
    driver: 'postgres' 
    server: 'localhost'
    user: 'mydbuser'
    password: 'mysecurepassword'  
    port: 5432
    database: 'mydatabase'

Then, in your main project script, add the following:

cachedb <- config::get("cachedb")

conn <- DBI::dbConnect(drv = RPostgres::Postgres(),
                      host = cachedb$server,
                      port = cachedb$port,
                      dbname = cachedb$database,
                      user = cachedb$user,
                      password = cachedb$password)

options("reproducible.conn" = conn) # sets the default connection globally

3.2 Using environment variables

Add the following to your project’s .Renviron file:

PGHOST="localhost"
PGPORT=5432
PGDATABASE="mydatabase"
PGUSER="mydbuser"
PGPASSWORD="mysecurepassword"

Then, in your main project script, add the following:

readRenviron(".Renviron") ## alternatively, use global ~/.Renviron

conn <-   DBI::dbConnect(drv = RPostgres::Postgres(),
                         host = Sys.getenv("PGHOST"),
                         port = Sys.getenv("PGPORT"),
                         dbname = Sys.getenv("PGDATABASE"),
                         user = Sys.getenv("PGUSER"),
                         password = Sys.getenv("PGPASSWORD"))

options("reproducible.conn" = conn) # sets the default connection globally