Forest plots date back to the 1970s and are most frequently seen in meta-analyses, but they are in no way restricted to these. The forestplot package is all about providing these in R. It originated from the ‘rmeta’ package’s forestplot function and has, apart from generating a standard forest plot, a few interesting features:
Expressions within the text, e.g. expression(beta).
gpar arguments (fontfamily, fontface, cex, etc.) for both summary and regular rows. These can be specified down to each individual cell.
‡ Features present in the original rmeta::forestplot function.
Note: An important difference from the original forestplot function is that the current function interprets xlog as the x-axis being in log format, i.e. you need to provide the data in the antilog/exp format.
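Because xlog = TRUE expects values that are already on the antilog scale, estimates that come out of a model on the log scale need to be exponentiated first. Here is a minimal base-R sketch of that step; the numbers and the names log_estimates and or_estimates are made up purely for illustration:

```r
# Hypothetical log odds ratios with confidence limits (illustrative values only)
log_estimates <- data.frame(
  mean  = c(-0.55, -1.80, 0.02),
  lower = c(-0.99, -4.02, -1.01),
  upper = c(-0.11, 0.42, 1.04)
)

# Back-transform to the odds-ratio scale before plotting with xlog = TRUE
or_estimates <- exp(log_estimates)
print(round(or_estimates, 2))
```

The resulting data frame keeps the mean, lower and upper column names, so it has the shape the plotting examples in this document work with.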
A forest plot is closely connected to text and the ability to customize the text is central.
Below is a basic example from the original forestplot function that shows how to use a table of text:
library(forestplot)
library(dplyr)
# Cochrane data from the 'rmeta'-package
cochrane_from_rmeta <- structure(list(mean = c(NA, NA, 0.578, 0.165, 0.246, 0.700, 0.348, 0.139, 1.017, NA, 0.531),
                                      lower = c(NA, NA, 0.372, 0.018, 0.072, 0.333, 0.083, 0.016, 0.365, NA, 0.386),
                                      upper = c(NA, NA, 0.898, 1.517, 0.833, 1.474, 1.455, 1.209, 2.831, NA, 0.731)),
                                 .Names = c("mean", "lower", "upper"),
                                 row.names = c(NA, -11L),
                                 class = "data.frame")

tabletext <- cbind(c("", "Study", "Auckland", "Block", "Doran", "Gamsu", "Morrison", "Papageorgiou", "Tauesch", NA, "Summary"),
                   c("Deaths", "(steroid)", "36", "1", "4", "14", "3", "1", "8", NA, NA),
                   c("Deaths", "(placebo)", "60", "5", "11", "20", "7", "7", "10", NA, NA),
                   c("", "OR", "0.58", "0.16", "0.25", "0.70", "0.35", "0.14", "1.02", NA, "0.53"))

cochrane_from_rmeta %>%
  forestplot(labeltext = tabletext,
             is.summary = c(rep(TRUE, 2), rep(FALSE, 8), TRUE),
             clip = c(0.1, 2.5),
             xlog = TRUE,
             col = fpColors(box = "royalblue",
                            line = "darkblue",
                            summary = "royalblue"))
dplyr syntax
As of version 2.0 the forestplot package is compatible with standard dplyr syntax. Above is a minor adaptation of the old code using this syntax. If you provide a data.frame it will assume that the column names are mean, lower, upper and labeltext unless you specify otherwise. Below is perhaps a more natural way of achieving the same as above, one that most likely corresponds better to a modern workflow.
# Cochrane data from the 'rmeta'-package
base_data <- tibble(mean = c(0.578, 0.165, 0.246, 0.700, 0.348, 0.139, 1.017),
                    lower = c(0.372, 0.018, 0.072, 0.333, 0.083, 0.016, 0.365),
                    upper = c(0.898, 1.517, 0.833, 1.474, 1.455, 1.209, 2.831),
                    study = c("Auckland", "Block", "Doran", "Gamsu", "Morrison", "Papageorgiou", "Tauesch"),
                    deaths_steroid = c("36", "1", "4", "14", "3", "1", "8"),
                    deaths_placebo = c("60", "5", "11", "20", "7", "7", "10"),
                    OR = c("0.58", "0.16", "0.25", "0.70", "0.35", "0.14", "1.02"))

summary <- tibble(mean = 0.531,
                  lower = 0.386,
                  upper = 0.731,
                  study = "Summary",
                  OR = "0.53",
                  summary = TRUE)

header <- tibble(study = c("", "Study"),
                 deaths_steroid = c("Deaths", "(steroid)"),
                 deaths_placebo = c("Deaths", "(placebo)"),
                 OR = c("", "OR"),
                 summary = TRUE)

empty_row <- tibble(mean = NA_real_)

cochrane_output_df <- bind_rows(header,
                                base_data,
                                empty_row,
                                summary)

cochrane_output_df %>%
  forestplot(labeltext = c(study, deaths_steroid, deaths_placebo, OR),
             is.summary = summary,
             clip = c(0.1, 2.5),
             xlog = TRUE,
             col = fpColors(box = "royalblue",
                            line = "darkblue",
                            summary = "royalblue"))
The same as above but with horizontal lines based on the summary elements, requested through the hrzl_lines argument:
cochrane_output_df %>%
  forestplot(labeltext = c(study, deaths_steroid, deaths_placebo, OR),
             is.summary = summary,
             clip = c(0.1, 2.5),
             hrzl_lines = gpar(col = "#444444"),
             xlog = TRUE,
             col = fpColors(box = "royalblue",
                            line = "darkblue",
                            summary = "royalblue"))
We can also choose which lines we want by providing a list where each name is the number of the line affected. In the example below these are the 3rd and 11th lines, counting the first line as the one above the first row (note that there is an empty row before the summary):
cochrane_output_df %>%
  forestplot(labeltext = c(study, deaths_steroid, deaths_placebo, OR),
             is.summary = summary,
             clip = c(0.1, 2.5),
             hrzl_lines = list("3" = gpar(lty = 2),
                               "11" = gpar(lwd = 1, columns = 1:4, col = "#000044")),
             xlog = TRUE,
             col = fpColors(box = "royalblue",
                            line = "darkblue",
                            summary = "royalblue",
                            hrz_lines = "#444444"))
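The numbering is easy to get wrong, so here is a small base-R sketch of the convention described above: line 1 sits above the first row, which means the line above row i is named i and the line below it is named i + 1. The helpers line_above and line_below are hypothetical and not part of forestplot.

```r
# Hypothetical helpers (not part of forestplot): map a row index to the
# name of the horizontal line directly above or below that row.
line_above <- function(row) as.character(row)
line_below <- function(row) as.character(row + 1)

# The Cochrane table has 2 header rows, 7 studies, 1 empty row and 1 summary row
n_header <- 2
n_rows <- 2 + 7 + 1 + 1

line_below(n_header) # line separating the header from the studies
line_above(n_rows)   # line just above the summary row
line_below(n_rows)   # line below the last row
```

For this table the three calls give "3", "11" and "12", which are the names used in the hrzl_lines lists in this document.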
For marking the start/end points it is common to add a vertical line at the end of each whisker. In forestplot you simply specify the vertices argument:
cochrane_output_df %>%
  forestplot(labeltext = c(study, deaths_steroid, deaths_placebo, OR),
             is.summary = summary,
             hrzl_lines = list("3" = gpar(lty = 2),
                               "11" = gpar(lwd = 1, columns = 1:4, col = "#000044")),
             clip = c(0.1, 2.5),
             xlog = TRUE,
             col = fpColors(box = "royalblue",
                            line = "darkblue",
                            summary = "royalblue",
                            hrz_lines = "#444444"),
             vertices = TRUE)
You can also choose to have the graph positioned within the text table by specifying the graph.pos argument:
cochrane_output_df %>%
  forestplot(labeltext = c(study, deaths_steroid, deaths_placebo, OR),
             is.summary = summary,
             graph.pos = 4,
             hrzl_lines = list("3" = gpar(lty = 2),
                               "11" = gpar(lwd = 1, columns = c(1:3, 5), col = "#000044"),
                               "12" = gpar(lwd = 1, lty = 2, columns = c(1:3, 5), col = "#000044")),
             clip = c(0.1, 2.5),
             xlog = TRUE,
             col = fpColors(box = "royalblue",
                            line = "darkblue",
                            summary = "royalblue",
                            hrz_lines = "#444444"))
If we present regression output it is sometimes convenient to have non-ASCII letters. For this section we will use my study comparing health-related quality of life 1 year after total hip arthroplasty between Sweden and Denmark:
data(dfHRQoL)
dfHRQoL <- dfHRQoL %>% mutate(est = sprintf("%.2f", mean), .after = labeltext)

clrs <- fpColors(box = "royalblue", line = "darkblue", summary = "royalblue")
tabletext <- list(c(NA, dfHRQoL %>% filter(group == "Sweden") %>% pull(labeltext)),
                  append(list(expression(beta)), dfHRQoL %>% filter(group == "Sweden") %>% pull(est)))

dfHRQoL %>%
  filter(group == "Sweden") %>%
  bind_rows(tibble(mean = NA_real_), .) %>%
  forestplot(labeltext = tabletext,
             col = clrs,
             xlab = "EQ-5D index")
Altering fonts may give a completely different feel to the table:
# You can set the font directly to the desired value; the if-block below only handles a platform-specific case on CRAN
font <- "mono"
if (grepl("Ubuntu", Sys.info()["version"])) {
  font <- "HersheyGothicEnglish"
}
dfHRQoL %>%
  filter(group == "Sweden") %>%
  forestplot(labeltext = c(labeltext, est),
             txt_gp = fpTxtGp(label = gpar(fontfamily = font)),
             col = clrs,
             xlab = "EQ-5D index")
There is also the possibility of being selective in gp-styles:
dfHRQoL %>%
  filter(group == "Sweden") %>%
  forestplot(labeltext = c(labeltext, est),
             txt_gp = fpTxtGp(label = list(gpar(fontfamily = font),
                                           gpar(fontfamily = "",
                                                col = "#660000")),
                              ticks = gpar(fontfamily = "", cex = 1),
                              xlab = gpar(fontfamily = font, cex = 1.5)),
             col = clrs,
             xlab = "EQ-5D index")
Clipping the interval is convenient for uncertain estimates in order to retain the resolution for those of more interest. The clipping simply adds an arrow to the confidence interval, see the bottom estimate below:
dfHRQoL %>%
  filter(group == "Sweden") %>%
  forestplot(labeltext = c(labeltext, est),
             clip = c(-.1, Inf),
             col = clrs,
             xlab = "EQ-5D index")
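To see in advance which rows will get an arrow, you can compare the interval limits with the clip range yourself. The sketch below is plain base R and reuses the Cochrane limits from earlier together with clip = c(0.1, 2.5). It only mirrors the idea of clipping and makes no claim about how forestplot implements it internally.

```r
# Confidence limits for the seven Cochrane studies used earlier
study <- c("Auckland", "Block", "Doran", "Gamsu", "Morrison", "Papageorgiou", "Tauesch")
lower <- c(0.372, 0.018, 0.072, 0.333, 0.083, 0.016, 0.365)
upper <- c(0.898, 1.517, 0.833, 1.474, 1.455, 1.209, 2.831)
clip <- c(0.1, 2.5)

# An interval is clipped when it extends beyond either limit
clipped_low <- lower < clip[1]
clipped_high <- upper > clip[2]

study[clipped_low]  # intervals cut off on the left
study[clipped_high] # intervals cut off on the right
```

With these limits four studies extend below 0.1 and one extends above 2.5, so those are the rows where you should expect arrows in the Cochrane plots.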
You can force the boxes to a certain size through the boxsize argument.
dfHRQoL %>%
  filter(group == "Sweden") %>%
  forestplot(labeltext = c(labeltext, est),
             boxsize = 0.2,
             clip = c(-.1, Inf),
             col = clrs,
             xlab = "EQ-5D index")
If you want to keep the relative sizes you need to provide a wrapper to the draw function that transforms the boxes. The example below shows how to combine multiple forest plots into one image using grid viewports:
library(grid)
grid.newpage()
borderWidth <- unit(4, "pt")
width <- unit(convertX(unit(1, "npc") - borderWidth, unitTo = "npc", valueOnly = TRUE)/2, "npc")
pushViewport(viewport(layout = grid.layout(nrow = 1,
                                           ncol = 3,
                                           widths = unit.c(width,
                                                           borderWidth,
                                                           width))
                      )
             )
pushViewport(viewport(layout.pos.row = 1,
                      layout.pos.col = 1))
dfHRQoL %>%
  filter(group == "Sweden") %>%
  forestplot(labeltext = c(labeltext, est),
             title = "Sweden",
             clip = c(-.1, Inf),
             col = clrs,
             xlab = "EQ-5D index",
             new_page = FALSE)
upViewport()
pushViewport(viewport(layout.pos.row = 1,
                      layout.pos.col = 2))
grid.rect(gp = gpar(fill = "#dddddd", col = "#eeeeee"))
upViewport()
pushViewport(viewport(layout.pos.row = 1,
                      layout.pos.col = 3))
dfHRQoL %>%
  filter(group == "Denmark") %>%
  forestplot(labeltext = c(labeltext, est),
             title = "Denmark",
             clip = c(-.1, Inf),
             col = clrs,
             xlab = "EQ-5D index",
             new_page = FALSE)
upViewport(2)
When combining similar outcomes for the same exposure I’ve found it useful to use multiple bands per row. This efficiently increases the data-ink ratio while making the comparison between the two bands trivial. I first used this in my paper comparing Swedish with Danish patients 1 year after total hip arthroplasty. Here the clipping also becomes obvious, as the Danish sample was much smaller, resulting in wider confidence intervals. With the new dplyr-adapted version 2.0 we can merge the groups into one table and group it with group_by:
dfHRQoL %>%
  group_by(group) %>%
  forestplot(clip = c(-.1, 0.075),
             shapes_gp = fpShapesGp(box = c("blue", "darkred") %>% lapply(function(x) gpar(fill = x, col = "#555555")),
                                    default = gpar(vertices = TRUE)),
             ci.vertices = TRUE,
             ci.vertices.height = 0.05,
             boxsize = .1,
             xlab = "EQ-5D index")
You can choose between a number of different estimate indicators. Using the example above we can set the Danish results to circles.
dfHRQoL %>%
  group_by(group) %>%
  forestplot(fn.ci_norm = c(fpDrawNormalCI, fpDrawCircleCI),
             boxsize = .25, # We set the box size to better visualize the type
             line.margin = .1, # We need to add this to avoid crowding
             clip = c(-.125, 0.075),
             shapes_gp = fpShapesGp(box = c("blue", "darkred") %>% lapply(function(x) gpar(fill = x, col = "#555555")),
                                    default = gpar(vertices = TRUE)),
             xlab = "EQ-5D index")
The confidence interval/box drawing functions are fully customizable. You can write your own function that accepts the parameters lower_limit, estimate, upper_limit, size, y.offset, clr.line, clr.marker, and lwd.
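A lightweight way to customize the drawing is to wrap an existing draw function rather than write one from scratch. The sketch below is base R only. scale_size is a hypothetical helper that returns a new function which rescales the size argument and forwards everything else. It assumes the draw function receives size as a named argument, as in the parameter list above. mock_draw merely stands in for a real draw function so that the example runs without a graphics device.

```r
# Hypothetical helper: wrap a draw function so that the box size is rescaled
scale_size <- function(draw_fn, factor) {
  function(size, ...) {
    draw_fn(size = size * factor, ...)
  }
}

# Stand-in for a real draw function; it just reports the arguments it received
mock_draw <- function(lower_limit, estimate, upper_limit, size, ...) {
  list(lower_limit = lower_limit, estimate = estimate,
       upper_limit = upper_limit, size = size)
}

smaller_boxes <- scale_size(mock_draw, 0.75)
result <- smaller_boxes(lower_limit = 0.1, estimate = 0.2, upper_limit = 0.3, size = 0.4)
result$size
```

Under the same assumption, something like fn.ci_norm = scale_size(fpDrawNormalCI, 0.75) could be passed to forestplot to shrink all boxes while keeping their relative sizes, though I have not verified this against every version of the package.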
You can furthermore choose between all available line types through the lty.ci argument, which can also be specified per element.
dfHRQoL %>%
  group_by(group) %>%
  forestplot(fn.ci_norm = c(fpDrawNormalCI, fpDrawCircleCI),
             boxsize = .25, # We set the box size to better visualize the type
             line.margin = .1, # We need to add this to avoid crowding
             clip = c(-.125, 0.075),
             lty.ci = c(1, 2),
             col = fpColors(box = c("blue", "darkred")),
             xlab = "EQ-5D index")
Legends are automatically added when using group_by, but we can also control them directly through the legend argument:
dfHRQoL %>%
  group_by(group) %>%
  forestplot(legend = c("Swedes", "Danes"),
             fn.ci_norm = c(fpDrawNormalCI, fpDrawCircleCI),
             boxsize = .25, # We set the box size to better visualize the type
             line.margin = .1, # We need to add this to avoid crowding
             clip = c(-.125, 0.075),
             col = fpColors(box = c("blue", "darkred")),
             xlab = "EQ-5D index")
This can be further customized by setting the legend_args argument using the fpLegend function:
dfHRQoL %>%
  group_by(group) %>%
  forestplot(legend = c("Swedes", "Danes"),
             legend_args = fpLegend(pos = list(x = .85, y = 0.25),
                                    gp = gpar(col = "#CCCCCC", fill = "#F9F9F9")),
             fn.ci_norm = c(fpDrawNormalCI, fpDrawCircleCI),
             boxsize = .25, # We set the box size to better visualize the type
             line.margin = .1, # We need to add this to avoid crowding
             clip = c(-.125, 0.075),
             col = fpColors(box = c("blue", "darkred")),
             xlab = "EQ-5D index")
If the automated ticks don’t match the desired ones, it is easy to change them using the xticks argument:
dfHRQoL %>%
  group_by(group) %>%
  forestplot(fn.ci_norm = c(fpDrawNormalCI, fpDrawCircleCI),
             boxsize = .25, # We set the box size to better visualize the type
             line.margin = .1, # We need to add this to avoid crowding
             clip = c(-.125, 0.075),
             col = fpColors(box = c("blue", "darkred")),
             xticks = c(-.1, -0.05, 0, .05),
             xlab = "EQ-5D index")
By adding a “labels” attribute to the ticks you can tailor the ticks even further. Here’s an example that suppresses the tick text for every other tick:
xticks <- seq(from = -.1, to = .05, by = 0.025)
xtlab <- rep(c(TRUE, FALSE), length.out = length(xticks))
attr(xticks, "labels") <- xtlab

dfHRQoL %>%
  group_by(group) %>%
  forestplot(fn.ci_norm = c(fpDrawNormalCI, fpDrawCircleCI),
             boxsize = .25, # We set the box size to better visualize the type
             line.margin = .1, # We need to add this to avoid crowding
             clip = c(-.125, 0.075),
             col = fpColors(box = c("blue", "darkred")),
             xticks = xticks,
             xlab = "EQ-5D index")
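The same idea generalizes to labelling every k-th tick. Here is a small base-R sketch, where label_every is a hypothetical helper and not a forestplot function:

```r
# Hypothetical helper: attach a logical "labels" attribute that keeps
# the text for every k-th tick, starting with the first one
label_every <- function(ticks, k) {
  attr(ticks, "labels") <- (seq_along(ticks) - 1) %% k == 0
  ticks
}

xticks_sparse <- label_every(seq(from = -.1, to = .05, by = 0.025), k = 3)
attr(xticks_sparse, "labels")
```

With k = 2 this gives the same labels as the rep(c(TRUE, FALSE), ...) pattern used above, and the result can be handed to the xticks argument in the same way.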
Sometimes you have a very tall graph and you want to add helper lines in order to make it easier to see the tick marks. This can be useful in non-inferiority or equivalence studies. You can do this through the grid argument:
dfHRQoL %>%
  group_by(group) %>%
  forestplot(fn.ci_norm = c(fpDrawNormalCI, fpDrawCircleCI),
             boxsize = .25, # We set the box size to better visualize the type
             line.margin = .1, # We need to add this to avoid crowding
             clip = c(-.125, 0.075),
             col = fpColors(box = c("blue", "darkred")),
             grid = TRUE,
             xticks = c(-.1, -0.05, 0, .05),
             zero = 0,
             xlab = "EQ-5D index")
You can easily customize both which grid lines to use and what type they should be by attaching a gpar object to a vector of positions as its gp attribute:
dfHRQoL %>%
  group_by(group) %>%
  forestplot(fn.ci_norm = c(fpDrawNormalCI, fpDrawCircleCI),
             boxsize = .25, # We set the box size to better visualize the type
             line.margin = .1, # We need to add this to avoid crowding
             clip = c(-.125, 0.075),
             col = fpColors(box = c("blue", "darkred")),
             grid = structure(c(-.1, -.05, .05),
                              gp = gpar(lty = 2, col = "#CCCCFF")),
             xlab = "EQ-5D index")
If you are unfamiliar with the structure call, it is equivalent to generating a vector and then setting an attribute, e.g.:
grid_arg <- c(-.1, -.05, .05)
attr(grid_arg, "gp") <- gpar(lty = 2, col = "#CCCCFF")

identical(grid_arg,
          structure(c(-.1, -.05, .05),
                    gp = gpar(lty = 2, col = "#CCCCFF")))
# Returns TRUE
Ok, that’s it. I hope you find the forestplot package useful.