Dates, times and timezones can be frustrating, especially when working with environmental time series such as those collected by air and water quality sensors.
Environmental time series data often have a strong diurnal signal and are typically plotted with a time axis displaying local time. However, when data are aggregated into larger collections, it is typical to store data with a universal time axis – UTC.
Problems can arise when parsing and formatting dates and times
because R defaults to the system timezone available with
Sys.timezone()
. Imagine an agency scientist based in
Washington, DC, using their laptop to display recent air quality data
from Los Angeles while at a conference in Tasmania. The data center
processing the data might be in Boulder but the data processing machine
might be set to use UTC. Potential timezones (available with
OlsonNames()
) relevant to this scenario include:
America/New_York
America/Los_Angeles
Australia/Tasmania
America/Denver
UTC
Which timezone should be used to convert a request for data from
“2019-08-08”” to “2018-08-15”” into POSIXct
datetimes?
To enforce specification of timezones and to help with the common user interface need to specify a range of dates or times, the MazamaCoreUtils package provides the following functions:
dateRange()
– parses and returns POSIXct
start and end dates representing full days in the specified
timezonetimeRange()
– parses and returns POSixct
start and end times in the specified timezoneparseDatetime()
– parses and returns a vector of
POSIXct
values in the specified timezoneThe parseDatetime()
function is intended as a
timezone-requiring replacement for
lubridate::parse_date_time()
.
Enforcing the specification of timezones throughout a body of code is the most robust way to remove timezone-related errors from your software. To help with this thistype of code review, the package also includes functions for testing whether specific named arguments are used with certain function calls:
lintFunctionArgs_file()
– check a single filelintFunctionArgs_dir()
– check an entire directoryTo use these functions you must define a set of
function:argument
rules to be applied such as:
timezoneLintRules <- list(
"parse_date_time" = "tz",
"with_tz" = "tzone",
"now" = "tzone",
"strftime" = "tz"
)
This is interpreted as:
parse_date_time()
function must use
the tz
argument explicitly.with_tz()
function must use the
tzone
argument explicitlyWhile these functions could be used to test for explicit use in any
function:argument
pair, our concern here is primarily with
specification of timezones. The packages includes a detailed list of
timezoneLintRules
to help with this. As an example, here is
the result of linting the dateRange.R
function in this
package:
> lintFunctionArgs_file("R/dateRange.R", timezoneLintRules)
# A tibble: 7 x 6
file line_number column_number function_name named_args includes_required
<chr> <int> <int> <chr> <list> <lgl>
1 dateRange.R 125 29 with_tz <chr [1]> TRUE
2 dateRange.R 128 27 with_tz <chr [1]> TRUE
3 dateRange.R 141 18 parse_date_time <chr [2]> TRUE
4 dateRange.R 142 18 parse_date_time <chr [2]> TRUE
5 dateRange.R 159 18 parse_date_time <chr [2]> TRUE
6 dateRange.R 176 18 parse_date_time <chr [2]> TRUE
7 dateRange.R 188 18 now <chr [1]> TRUE
The result shows that the dateRange.R
source code is
consistent in always explicitly specifying a timezone.
Hopefully, this attention to timezones will help your code avoid misunderstandings when it comes to date and time requests.