This vignette is designed to highlight some basic aspects of using the transition pairing method (TPM) in R. It has been written with the assumption that the reader is already familiar with the TPM itself and is simply in need of direction on how to implement it. The core functions are `PAutilities::get_transition_info` and `PAutilities::spurious_curve`, along with S3 methods for `plot` and `summary`, and S4 methods for `+` and `-`.
The TPM is used for evaluating dynamic segmentation algorithms, especially in the context of physical activity data from wearable sensors. A prime example is an algorithm that aims to identify transitions between different activities. Throughout this vignette, we will suppose we are evaluating such an algorithm.
Suppose we have two indicator vectors representing the occurrence of a transition (1) or non-transition (0) at certain time points. One vector represents a criterion measure (e.g., direct observation), and the other represents our comparison measure (i.e., the aforementioned transition-detection algorithm).
```r
set.seed(100)
algorithm <- (sample(1:100) %% 2)
criterion <- (sample(1:100) %% 2)
```
Ideally, the vectors have equal length. If not, the shorter vector will be expanded (with a warning), such that it has the same length as the longer vector, with transitions placed proportionally to the original. For example, suppose we have a 10-item vector with one transition:
{0, 0, 0, 1, 0, 0, 0, 0, 0, 0}
The transition occurs at index 4 of 10, or 40% of the way into the vector. If we expand the vector to size 25, it will become:
{0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
The transition now occurs at index 10 of 25, still 40% of the way into the vector.
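To make the mapping concrete, here is a minimal sketch of the proportional placement logic. It is for illustration only; `expand_indicator` is a hypothetical helper, not a `PAutilities` function, and the package's internal implementation may differ.

```r
# Hypothetical helper illustrating proportional expansion of an indicator
# vector (illustration only; not the package's internal code)
expand_indicator <- function(x, new_length) {
  out <- rep(0L, new_length)
  # Rescale each transition index to the same relative position
  new_idx <- round(which(x == 1) / length(x) * new_length)
  out[new_idx] <- 1L
  out
}

expand_indicator(c(0, 0, 0, 1, 0, 0, 0, 0, 0, 0), 25)
# The single transition lands at index 10 of 25, still 40% of the way in
```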
We will first run the TPM using a spurious pairing threshold of 7.
NOTE: In both `get_transition_info` and `spurious_curve`, the spurious pairing threshold values reflect indices, not necessarily seconds. In the example below, the threshold setting (`window_size = 7`) would reflect seconds if both the `predictions` and `references` objects were second-by-second variables. On the other hand, if they were 5-s epochs (i.e., a 0.2 Hz sampling rate), `window_size = 7` would correspond to a 35-s lag time allowance.
```r
transitions <- get_transition_info(
  predictions = algorithm,
  references = criterion,
  window_size = 7
)
```
This gives us an object of class `transition`. We can first plot it.
```r
plot(transitions)
```
The visualization can be nice for several purposes, not least of which is seeing which pairings (if any) were made non-sequentially and rejected by the pruning algorithm. (Non-sequential pairings are shown with red pairing lines.)
To obtain performance metrics, we can summarize the `transitions` object.
```r
summarized1 <- summary(transitions)
```
This gives an S4 object of class `summaryTransition`. The `result` slot must be accessed to obtain the metrics.
```r
summarized1@result
#> window_size reference_positives predicted_positives true_positives
#> 1 7 50 50 35
#> n_rejected_pairs mean_abs_lag_indices sd_abs_lag_indices
#> 1 9 0.3714286 0.598317
#> mean_sd_abs_lag_indices mean_signed_lag_indices sd_signed_lag_indices
#> 1 0.4 ± 0.6 0.2571429 0.6572159
#> mean_sd_signed_lag_indices recall precision rmse_lag_indices rmse_prop
#> 1 0.3 ± 0.7 0.7 0.7 0.7 0.9
#> aggregated_performance
#> 1 0.7666667

# or:
# slot(summarized1, "result")
```
NOTE: One or more versions prior to `PAutilities` 1.0.0 included the Rand index (in various forms) in the output. The associated package is no longer supported, so this feature has been removed. At any rate, the Rand index is not recommended, for reasons similar to those discussed elsewhere regarding the Needleman-Wunsch algorithm. (Background: Essentially, the Rand index gives a score between 0 and 1 that reflects how well aligned the criterion and predicted segments are. Different adjustments can be applied, which you can read about in the CLUES paper.)
ANOTHER NOTE: The `result` slot also includes some extra variables, such as signed lags. Two important variables are `rmse_prop` and `aggregated_performance`. The former expresses RMSE in relative terms, i.e., as a value between 0% (pessimal RMSE, equal to the spurious pairing threshold) and 100% (optimal RMSE, equal to 0). The advantage of this metric (RMSE%) is that it puts RMSE on the same scale as recall and precision, allowing all three of them to be averaged into a single indicator of performance, i.e., `aggregated_performance`. This approach is useful when a single criterion is needed, e.g., for determining which algorithm settings provide the best performance.
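For intuition, the output shown earlier is consistent with the following relationships (a back-of-the-envelope check, not a substitute for the package documentation):

```r
# RMSE% rescales RMSE against the spurious pairing threshold
rmse_lag  <- 0.7   # rmse_lag_indices from the output above
threshold <- 7     # the window_size we used
1 - rmse_lag / threshold   # 0.9, matching rmse_prop

# Aggregated performance is then the mean of the three rescaled metrics
mean(c(0.7, 0.7, 0.9))     # 0.7666667, matching aggregated_performance
```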
At this point, the TPM process may seem unnecessarily complicated:

1. Run `get_transition_info` to obtain a `transition` object.
2. Call `summary` on the object.
3. Access the `result` slot.

Even with `magrittr` pipes, this takes up some space:
```r
suppressPackageStartupMessages(
  library(magrittr, quietly = TRUE, verbose = FALSE)
)

summarized <-
  get_transition_info(algorithm, criterion, 7) %>%
  summary(.) %>%
  slot("result")
```
Here’s why that level of separation is worthwhile: it makes it possible to combine objects or examine how they differ. Let’s say we ran our algorithm on two separate occasions.
```r
# Here I'm exploiting seed changes to get different values from the same code
# I used previously
algorithm2 <- (sample(1:100) %% 2)
criterion2 <- (sample(1:100) %% 2)
```
We may be interested in looking at the combined performance for both occasions. For that, we can add summary objects together.
```r
summarized2 <-
  get_transition_info(algorithm2, criterion2, 7) %>%
  summary(.)

# Store the result of addition (another S4 summaryTransition object)
added <- summarized1 + summarized2
```
```r
# Now view the result
added@result
#> window_size reference_positives predicted_positives true_positives
#> 1 7 100 100 72
#> n_rejected_pairs mean_abs_lag_indices sd_abs_lag_indices
#> 1 16 0.375 0.6377226
#> mean_sd_abs_lag_indices mean_signed_lag_indices sd_signed_lag_indices
#> 1 0.4 ± 0.6 0.1527778 0.7250007
#> mean_sd_signed_lag_indices recall precision rmse_lag_indices rmse_prop
#> 1 0.2 ± 0.7 0.72 0.72 0.7 0.9
#> aggregated_performance
#> 1 0.78

# or:
# slot(added, "result")
```
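As a quick sanity check on what addition does, the combined output is consistent with pooling the underlying counts across occasions (an observation based on the numbers above, not on package internals):

```r
# Recall recomputed from the pooled true and reference positives
72 / 100   # 0.72, matching the recall in the combined output
```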
Or we may be interested in looking at whether the algorithm performed better on one occasion than the other. For that, we can subtract. (NOTE: The subtraction method returns a list, not an S4 object, and the `differences` element is the item of interest.)
```r
subtracted <- summarized1 - summarized2

subtracted$differences
#> window_size diff_window_size diff_recall diff_precision
#> 1 7 0 -0.04 -0.04
#> diff_mean_abs_lag_indices diff_mean_signed_lag_indices diff_rmse_lag_indices
#> 1 -0.006949807 0.2030888 -0.1
#> diff_rmse_prop diff_aggregated_performance
#> 1 0.01428571 -0.02190476
```
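Individual differences can be pulled out like any other data frame column:

```r
# Isolate a single metric from the differences element
subtracted$differences$diff_aggregated_performance
#> [1] -0.02190476
```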
It’s useful to run your analysis for multiple values of the spurious pairing threshold. That can also be done conveniently through `PAutilities`. Let’s look at how performance changes (for our original `transitions` object, obtained when we first ran the algorithm) when we use settings of 5-10.
```r
curve <- spurious_curve(trans = transitions, thresholds = 5:10)

class(curve)
#> [1] "list" "spurious_curve"

sapply(curve, class)
#> [1] "summaryTransition" "summaryTransition" "summaryTransition"
#> [4] "summaryTransition" "summaryTransition" "summaryTransition"
```
That gives us a list of `summaryTransition` objects, one for each threshold setting. The list also inherits class `spurious_curve`, which has a convenient `plot` method.
```r
par(mar = rep(3, 4))
plot(curve)
```
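If you want the raw numbers behind the plot, one convenient pattern (a sketch, assuming the one-row `result` data frames shown earlier) is to bind the `result` slots into a single data frame:

```r
# Combine the result slots across thresholds for side-by-side inspection
results <- do.call(rbind, lapply(curve, slot, "result"))
results[ , c("window_size", "recall", "precision", "aggregated_performance")]
```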
This vignette has provided a crash course in running the TPM in different ways. Setting up a full analysis can still take some work, but `PAutilities` provides solid infrastructure to help you do so in a controlled way. Suggested improvements are welcome. You can post issues or pull requests on the package’s GitHub site. See you there.