The crul package documentation mostly documents how to work with any particular function or class, but does not detail how you would use the package in a more realistic context. This vignette outlines what we think of as best practices for using crul in scripts or an R package.
In most cases you'll only need to import one thing from crul: HttpClient. Add crul to Imports in your DESCRIPTION file, and add an entry like @importFrom crul HttpClient somewhere in your package documentation, for example:
#' Some function
#'
#' @export
#' @importFrom crul HttpClient
#' ...
If you have more than one function that needs to make an HTTP request it’s probably useful to have a function for doing HTTP requests. The following is an example of a function.
xGET <- function(url, path, args = list(), ...) {
  cli <- crul::HttpClient$new(url, opts = list(...))
  res <- cli$get(path = path, query = args)
  res$raise_for_status()
  res$raise_for_ct_json()
  res$parse("UTF-8")
}
There are some features to note in the above function:
url: this really depends on your setup. In some cases the base URL doesn't change, so you can remove the url parameter and define the URL in the crul::HttpClient$new() call.
path: this will likely hold anything after the base URL.
args: a named list of query arguments. The default of list() means you can use the function without passing args in cases where no query arguments are needed.
...: this is called an ellipsis. See the example and discussion below.
You can use the function like:
x <- xGET("https://httpbin.org", "get", args = list(foo = "bar"))
# parse the JSON to a list
jsonlite::fromJSON(x)
# more parsing
Because we used an ellipsis, anyone can pass in curl options like:
xGET("https://xxx.org", args = list(foo = "bar"), verbose = TRUE)
Curl options are important for digging into the details of HTTP requests, and go a long way towards users being able to sort out their own problems, and help you diagnose problems as well.
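As a minimal sketch of the note on url above, assuming your package only ever talks to a single API, the base URL can live inside the helper. The names base_url and xGET_fixed are hypothetical and used only for illustration; https://httpbin.org is reused from the example above.

```r
# Hypothetical variant of xGET() with the base URL defined once
base_url <- "https://httpbin.org"  # assumption: the API has a single base URL

xGET_fixed <- function(path, args = list(), ...) {
  # curl options passed via ... are forwarded to the client
  cli <- crul::HttpClient$new(base_url, opts = list(...))
  res <- cli$get(path = path, query = args)
  res$raise_for_status()
  res$raise_for_ct_json()
  res$parse("UTF-8")
}

# Usage (needs network access):
# x <- xGET_fixed("get", args = list(foo = "bar"), verbose = TRUE)
```

Callers then only supply the path and query arguments, while curl options still pass through the ellipsis.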
Alternatively, you can just do the HTTP request in your xGET function and return the response object, then parse results as needed, either line by line or with another function.
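One way to sketch that split, assuming a JSON API, is a request helper that returns the crul response object and a separate parsing helper. Both function names (x_request and x_parse) are hypothetical and not part of crul.

```r
# Hypothetical helper: make the request and return the response object as-is
x_request <- function(url, path, args = list(), ...) {
  cli <- crul::HttpClient$new(url, opts = list(...))
  cli$get(path = path, query = args)
}

# Hypothetical helper: check the response, then parse the JSON body to a list
x_parse <- function(res) {
  res$raise_for_status()
  res$raise_for_ct_json()
  jsonlite::fromJSON(res$parse("UTF-8"))
}

# Usage (needs network access):
# res <- x_request("https://httpbin.org", "get", args = list(foo = "bar"))
# res$status_code
# dat <- x_parse(res)
```

Returning the response object keeps the status code and headers available to the caller, which can be handy when different endpoints need different parsing.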
fauxpas is in Suggests in crul. If you don't have it installed, no worries, but if you do have it installed, crul uses it.
There is not much difference with the default raise_for_status() between using fauxpas and not using it.
However, you can construct your own replacement with fauxpas that gives you more flexibility in how you deal with HTTP status codes.
First, make an HTTP request:
con <- crul::HttpClient$new("https://httpbin.org/status/404")
res <- con$get()
Then use fauxpas::find_error_class to get the correct R6 error class for the status code, in this case 404:
x <- fauxpas::find_error_class(res$status_code)$new()
x
#> <HTTPNotFound>
#> behavior: stop
#> message_template: {{reason}} (HTTP {{status}})
#> message_template_verbose: {{reason}} (HTTP {{status}}).\n - {{message}}
We can then do one of two things: use $do() or $do_verbose(). $do() is simpler and gives you the same thing $raise_for_status() gives, but allows you to change the behavior (stop vs. warning vs. message) and how the message is formatted. By default we get:
x$do(res)
#> Error: Not Found (HTTP 404)
We can change the template using whisker templating:
x$do(res, template = "{{status}}\n --> {{reason}}")
#> Error: 404
#> --> Not Found
$do_verbose() gives you a lot more detail about the status code, possibly more than you want:
x$do_verbose(res)
#> Error: Not Found (HTTP 404).
#> - The server has not found anything matching the Request-URI. No indication
#> is given of whether the condition is temporary or permanent. The 410 (Gone)
#> status code SHOULD be used if the server knows, through some internally configurable
#> mechanism, that an old resource is permanently unavailable and has no forwarding
#> address. This status code is commonly used when the server does not wish to
#> reveal exactly why the request has been refused, or when no other response
#> is applicable.
You can change behavior to either warning or message:
x$behavior <- "warning"
x$do(res)
#> Warning message:
#> Not Found (HTTP 404)
x$behavior <- "message"
x$do(res)
#> Not Found (HTTP 404)
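The pieces above can be combined into a small replacement for raise_for_status(). This is only a sketch: raise_custom is a hypothetical name, the default template is copied from the message_template shown earlier, and treating only status codes of 400 and above as errors is an assumption.

```r
# Hypothetical helper: raise_custom() is not part of crul or fauxpas
raise_custom <- function(res, behavior = "stop",
                         template = "{{reason}} (HTTP {{status}})") {
  # assumption: only 4xx and 5xx responses are treated as errors
  if (res$status_code < 400) return(invisible(res))
  x <- fauxpas::find_error_class(res$status_code)$new()
  x$behavior <- behavior  # "stop", "warning", or "message"
  x$do(res, template = template)
}

# Usage (needs network access and the fauxpas package):
# res <- crul::HttpClient$new("https://httpbin.org/status/404")$get()
# raise_custom(res, behavior = "warning")
```

A wrapper like this lets a package downgrade some failures to warnings without repeating the fauxpas setup at every call site.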
In some failure scenarios it may make sense to retry the same request. For example, if a 429 "Too Many Requests" HTTP status is returned, you can retry the request after a certain amount of time (that time should be supplied by the server). We suggest using the retry method if you are in these scenarios. See HttpClient$retry() for more information.
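A minimal sketch of a retrying helper follows. The name xGET_retry is hypothetical, the argument values are illustrative, and the arguments passed to $retry() (times, pause_base, retry_only_on) are used here as we understand them from the HttpClient documentation; check ?HttpClient for the exact behavior in your installed version.

```r
# Hypothetical helper built on HttpClient$retry()
xGET_retry <- function(url, path, args = list(), times = 3, ...) {
  cli <- crul::HttpClient$new(url, opts = list(...))
  res <- cli$retry(
    "get",                # HTTP verb to retry
    path = path,
    query = args,
    times = times,        # maximum number of retries
    pause_base = 1,       # basis for the wait time between attempts
    retry_only_on = 429   # only retry on "Too Many Requests"
  )
  res$raise_for_status()
  res
}

# Usage (needs network access):
# res <- xGET_retry("https://httpbin.org", "status/429", times = 2)
```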
webmockr is a package for stubbing and setting expectations on HTTP requests. It has support for working with two HTTP request packages: crul and httr.
There are a variety of use cases for webmockr.
webmockr allows you to give back exact responses just as you describe and even fail with certain HTTP conditions. Getting certain failures to happen with a remote server can sometimes be difficult.
You can also use webmockr in a test suite, although the next section covers vcr, which builds on top of webmockr and is ideal for tests.
See the book HTTP mocking and testing in R for more.
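As a rough sketch of stubbing a failure, the helper below registers a stub for a GET request and returns whatever status code the stub is told to give back. The function name stubbed_status is hypothetical, and https://httpbin.org/get is only used as a URL to match against; no real request should be made while the stub is active.

```r
# Hypothetical helper: request a stubbed URL and return its status code
stubbed_status <- function(status = 503) {
  webmockr::enable(adapter = "crul")  # intercept crul requests
  on.exit({
    webmockr::stub_registry_clear()   # remove the stub again
    webmockr::disable(adapter = "crul")
  })
  stub <- webmockr::stub_request("get", "https://httpbin.org/get")
  webmockr::to_return(stub, status = status)
  res <- crul::HttpClient$new("https://httpbin.org")$get("get")
  res$status_code
}

# Usage (needs the crul and webmockr packages):
# stubbed_status(503)
```

This makes it possible to exercise error handling for status codes that are hard to trigger on a real server.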
vcr records and replays HTTP requests. Its main use case is for caching HTTP requests in test suites in R packages. It has support for working with two HTTP request packages: crul and httr.
To use vcr for testing, the setup is pretty easy.
1. Add vcr to Suggests in your DESCRIPTION file.
2. Create a file in your tests/testthat/ directory called helper-yourpackage.R (or skip this step if a similar file already exists). In that file use the following lines to set up your path for storing cassettes (change path to whatever you want):
library("vcr")
invisible(vcr::vcr_configure())
3. For the tests that should use vcr, wrap the tests in a vcr::use_cassette() call like:
library(testthat)
test_that("my test", {
  vcr::use_cassette("rl_citation", {
    aa <- rl_citation()

    expect_is(aa, "character")
    expect_match(aa, "IUCN")
    expect_match(aa, "www.iucnredlist.org")
  })
})
That's it! Just run your tests as you normally would and any HTTP requests done by crul or httr will be cached on the first test run, then the cached responses are used every time thereafter.
See the book HTTP mocking and testing in R for more.
Let us know if there’s anything else you’d like to see in this document and/or if there’s anything that can be explained better.
Last, the httr package has a similar article on best practices, see https://httr.r-lib.org/articles/api-packages.html