Internals of multiple response objects

Thomas Lumley

25/07/2022

There are three main classes in the rimu package: two implementations of multiple-response objects and one of multiple-score objects. Multiple-response objects generalise factors. Like factors, they have a set of permitted levels; unlike factors, it’s possible for zero or more than one of the levels to be observed. Multiple responses have the same relationship to check boxes that factors have to radio buttons.

The main class, mr, represents multiple-response objects internally as a logical matrix. Each row is a logical vector with one entry per permitted level: TRUE if that level was observed, FALSE if it was not, and NA if we don’t know. The objects behave as a single S3 vector in many ways: they can be a single data frame column, the length function returns the number of rows, and they print as a single character vector.

library(rimu)
data(ethnicity)
ethnicity
## [1] "European"              "European+Maori"        "Maori"                
## [4] "Maori+Pacific"         "?European+?Maori"      "European+?Maori+Asian"
length(ethnicity)
## [1] 6
unclass(ethnicity)
##      European Maori Pacific Asian MELAA
## [1,]     TRUE FALSE   FALSE FALSE FALSE
## [2,]     TRUE  TRUE   FALSE FALSE FALSE
## [3,]    FALSE  TRUE   FALSE FALSE FALSE
## [4,]    FALSE  TRUE    TRUE FALSE FALSE
## [5,]       NA    NA   FALSE FALSE FALSE
## [6,]     TRUE    NA   FALSE  TRUE FALSE
data.frame(ethnicity)
##        ethnicity
## 1       European
## 2 European+Maori
## 3          Maori
## 4  Maori+Pacific
## 5               
## 6 European+Asian
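
To make the mapping between the logical matrix and the printed labels concrete, here is a minimal base-R sketch. It does not use rimu and is not the package's actual print method; the matrix m and the helper label_row() are hypothetical names introduced only for illustration. It assumes the convention visible in the output above: observed levels are joined with "+", and levels whose entry is NA are prefixed with "?".

```r
# Hypothetical stand-in for the internal representation: a plain logical
# matrix with one column per level (not an actual rimu object).
m <- rbind(
  c(TRUE,  FALSE, FALSE, FALSE, FALSE),
  c(TRUE,  TRUE,  FALSE, FALSE, FALSE),
  c(NA,    NA,    FALSE, FALSE, FALSE),
  c(TRUE,  NA,    FALSE, TRUE,  FALSE)
)
colnames(m) <- c("European", "Maori", "Pacific", "Asian", "MELAA")

# label_row() is a hypothetical helper: keep the levels that are TRUE or NA,
# and prefix the NA ("don't know") ones with "?".
label_row <- function(row, levels) {
  keep <- is.na(row) | row
  paste0(ifelse(is.na(row[keep]), "?", ""), levels[keep], collapse = "+")
}

# One label per row of the matrix.
labels <- apply(m, 1, label_row, levels = colnames(m))
labels
```

A row with no TRUE or NA entries yields an empty string, which is one way to represent "none of the levels observed".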

The second class, vmr, also creates a new S3 vector type, but does so using the vctrs package from the tidyverse. This is necessary (or at least the easiest way) to include multiple-response objects in tidyverse tbl_df objects (tibbles). The vmr class requires the vctrs package to work, and is only really useful if you have the tibble and pillar packages as well (as even a fairly minimal tidyverse installation will). Internally, a vmr object is a list of character vectors, plus an attribute specifying the permitted levels. It’s built on the vctrs_list_of class. Unfortunately, this representation does not allow for “don’t know” membership.

eth<-as.vmr(ethnicity, na.rm=TRUE)
eth
## <vmultiresp[6]>
## [1] European       European+Maori Maori          Maori+Pacific                
## [6] Asian+European
length(eth)
## [1] 6
unclass(eth)
## [[1]]
## [1] "European"
## 
## [[2]]
## [1] "European" "Maori"   
## 
## [[3]]
## [1] "Maori"
## 
## [[4]]
## [1] "Maori"   "Pacific"
## 
## [[5]]
## character(0)
## 
## [[6]]
## [1] "European" "Asian"   
## 
## attr(,"levs")
## [1] "European" "Maori"    "Pacific"  "Asian"    "MELAA"   
## attr(,"ptype")
## character(0)

All the same functions are available for mr and vmr objects, though most of the vmr methods work by converting to mr and back again.
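
The round trip between the two representations can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's conversion code: to_matrix() and to_list() are hypothetical helpers, and the list x with its levels levs merely mimics the list-of-character-vectors layout shown above.

```r
# Hypothetical base-R sketch, not rimu's code: a list of character vectors
# plus the permitted levels, mimicking the list-based representation.
levs <- c("European", "Maori", "Pacific", "Asian", "MELAA")
x <- list("European", c("European", "Maori"), character(0), c("European", "Asian"))

# List -> logical matrix: one row per element, one column per level.
to_matrix <- function(x, levs) {
  m <- t(vapply(x, function(v) levs %in% v, logical(length(levs))))
  colnames(m) <- levs
  m
}

# Logical matrix -> list: keep the names of the TRUE columns in each row.
# which() skips NA, so "don't know" entries cannot be carried across.
to_list <- function(m) {
  lapply(seq_len(nrow(m)), function(i) colnames(m)[which(m[i, ])])
}

m <- to_matrix(x, levs)
back <- to_list(m)
identical(back, x)
```

The list form can only say whether a level is present or absent, so an NA in the matrix has nowhere to go on the way back; in this sketch it is silently dropped.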

Finally, multiple-score (ms) objects record a non-zero numeric score for each observed level. This might be a rank (list the first three birds you saw) or a monetary amount (how much did you spend on…). Internally, they are represented as numeric matrices.

data(nzbirds)
nzbirds
##      kea ruru tui tauhou kaki
## [1,] 1   .    2   .      .   
## [2,] 1   2    .   .      3   
## [3,] .   1    .   .      .   
## [4,] .   2    1   .      .   
## [5,] 2   3    1   .      4   
## [6,] 2   <NA> .   1      .
d<-data.frame(nzbirds)
dim(d)
## [1] 6 1
d
##             nzbirds
## 1           kea+tui
## 2     kea+ruru+kaki
## 3              ruru
## 4          ruru+tui
## 5 kea+ruru+tui+kaki
## 6        kea+tauhou
as.mr(nzbirds)
## [1] "kea+tui"           "kea+ruru+kaki"     "ruru"             
## [4] "ruru+tui"          "kea+ruru+tui+kaki" "kea+?ruru+tauhou"
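
The relationship between scores and plain multiple responses can be sketched with an ordinary numeric matrix. This is a base-R illustration rather than the package's implementation, and it assumes that a score of 0 stands for "not observed" (the "." entries in the printed output) and that NA stands for "don't know". The names scores, present and first_seen are hypothetical.

```r
# Hypothetical stand-in for a multiple-score object: ranks in a numeric
# matrix, with 0 assumed to mean "not listed" and NA "don't know".
scores <- rbind(
  c(1, 0,  2, 0, 0),
  c(1, 2,  0, 0, 3),
  c(2, NA, 0, 1, 0)
)
colnames(scores) <- c("kea", "ruru", "tui", "tauhou", "kaki")

# Dropping the scores leaves presence/absence: non-zero -> TRUE, NA stays NA.
present <- scores != 0

# Level with the best (smallest) rank in each row, ignoring unknowns.
first_seen <- colnames(scores)[apply(scores, 1, function(r) {
  r[which(r == 0)] <- NA   # treat "not listed" as missing for the ranking
  which.min(r)
})]
first_seen
```

Comparing against zero is enough to recover a logical matrix of the kind used for multiple responses, with the unknown entry preserved as NA.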