library(dplyr)
library(matsbyname)
library(tibble)
matsbyname functions in which operands are specified in a ... argument are ambiguous when applied to a data frame. But there is an argument (.summarise) that signals intention, allowing the ambiguous functions to be used flexibly with data frames.
For normal functions, such as + and mean(), there is no ambiguity about their operation in a data frame.
df <- tibble::tribble(~x, ~y, ~z,
                      1, 2, 3,
                      4, 5, 6)
# Typically, operations are done across rows.
df %>%
  dplyr::mutate(
    a = x + y + z,
    b = rowMeans(.)
  )
#> # A tibble: 2 × 5
#>       x     y     z     a     b
#>   <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1     1     2     3     6     2
#> 2     4     5     6    15     5
To perform the same operations down columns, use dplyr::summarise().
df %>%
  dplyr::summarise(
    x = sum(x),
    y = sum(y),
    z = sum(z)
  )
#> # A tibble: 1 × 3
#>       x     y     z
#>   <dbl> <dbl> <dbl>
#> 1     5     7     9
df %>%
  dplyr::summarise(
    x = mean(x),
    y = mean(y),
    z = mean(z)
  )
#> # A tibble: 1 × 3
#>       x     y     z
#>   <dbl> <dbl> <dbl>
#> 1   2.5   3.5   4.5
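As a rough cross-check of the two directions, base R's rowSums()/rowMeans() and colSums()/colMeans() can be applied to the same data. This is only an illustrative sketch, not part of matsbyname; the variable names (across_rows_sum, down_cols_sum, and so on) are made up for the example, and df is redefined so the block runs on its own.

```r
library(tibble)

# Same example data frame as above.
df <- tibble::tribble(~x, ~y, ~z,
                      1, 2, 3,
                      4, 5, 6)

# "Across rows": one result per row, as with dplyr::mutate().
across_rows_sum  <- rowSums(df)
across_rows_mean <- rowMeans(df)

# "Down columns": one result per column, as with dplyr::summarise().
down_cols_sum  <- colSums(df)
down_cols_mean <- colMeans(df)
```

The row-wise results have one element per row of df, whereas the column-wise results have one element per column, which is the distinction the rest of this discussion relies on.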
matsbyname::sum_byname()
What does matsbyname::sum_byname() mean for a data frame? Will it give sums across rows (as +), or will it give sums down columns (as summarise())? This ambiguity is present for all *_byname() functions in which operands are specified via the ... argument, including matrixproduct_byname(), hadamardproduct_byname(), mean_byname(), etc.
To resolve the ambiguity, use the .summarise argument. The default value of .summarise is FALSE, meaning that the functions normally operate across rows. If you want to perform the action down columns, set .summarise = TRUE.
df %>%
  dplyr::mutate(
    a = sum_byname(x, y, z),
    b = mean_byname(x, y, z)
  )
#> # A tibble: 2 × 5
#>       x     y     z     a     b
#>   <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1     1     2     3     6     2
#> 2     4     5     6    15     5
df %>%
  dplyr::summarise(
    x = sum_byname(x, .summarise = TRUE) %>% unlist(),
    y = sum_byname(y, .summarise = TRUE) %>% unlist(),
    z = sum_byname(z, .summarise = TRUE) %>% unlist()
  )
#> # A tibble: 1 × 3
#>       x     y     z
#>   <dbl> <dbl> <dbl>
#> 1     5     7     9
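When many columns need the same treatment, the per-column calls can likely be collapsed with dplyr::across(). The following is a sketch rather than a documented matsbyname idiom: it assumes dplyr 1.0.0 or later (for across()) and simply wraps the same sum_byname(..., .summarise = TRUE) call followed by unlist() used in the summarise() example above. The name col_sums is hypothetical.

```r
library(dplyr)
library(matsbyname)
library(tibble)

# Same example data frame as above.
df <- tibble::tribble(~x, ~y, ~z,
                      1, 2, 3,
                      4, 5, 6)

# Apply the column-wise sum_byname() call to every column at once.
# .x is the current column; unlist() unwraps the list that
# sum_byname() returns when .summarise = TRUE.
col_sums <- df %>%
  dplyr::summarise(
    dplyr::across(
      dplyr::everything(),
      ~ unlist(matsbyname::sum_byname(.x, .summarise = TRUE))
    )
  )
```

Because each column goes through the same call as in the explicit version, col_sums should hold the same one-row result (x = 5, y = 7, z = 9).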
The .summarise argument broadens the range of applicability for many matsbyname functions, especially when used with data frames. The default is .summarise = FALSE, meaning that operations will be performed across rows. Set .summarise = TRUE to signal intent to perform operations down a column.