Tables

Robin J. Evans

2021-01-26

This package contains methods for storing and manipulating collections of contingency tables, and for easily vectorizing functions which apply to a contingency table.

The package is built around the tables class, whose objects hold a collection of numerical tables all of the same dimension. Let’s create a collection of 10 contingency tables (in this case probability tables), each of dimension 2x2x2.

library(contingency)
## Loading required package: rje
tab <- rprobMat(10, 2, 3)
tab
## Group of 10 numeric tables of dimension 2x2x2
## First entry:
## , , 1
## 
##            [,1]       [,2]
## [1,] 0.06882546 0.05337011
## [2,] 0.01401081 0.01098293
## 
## , , 2
## 
##            [,1]      [,2]
## [1,] 0.21224108 0.3154456
## [2,] 0.02320423 0.3019198

The print method shows only the first table in the collection.
The tables are stored as a matrix, as can be seen by using the dim() function. Accessing particular rows of this matrix returns the corresponding tables:

tab[c(1,4,5),]
## Group of 3 numeric tables of dimension 2x2x2
## First entry:
## , , 1
## 
##            [,1]       [,2]
## [1,] 0.06882546 0.05337011
## [2,] 0.01401081 0.01098293
## 
## , , 2
## 
##            [,1]      [,2]
## [1,] 0.21224108 0.3154456
## [2,] 0.02320423 0.3019198
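To make the idea of storing tables as rows of a matrix concrete, here is a minimal base R sketch that does not use the package. The helper make_table() and the objects tabs, mat and first are hypothetical names, and the cell ordering (R's usual column-major order) is an assumption for illustration rather than a statement about the package's internals.

```r
# Three hypothetical 2x2x2 probability tables, built with base R only.
set.seed(1)
make_table <- function() {
  x <- array(rexp(8), dim = c(2, 2, 2))  # positive weights
  x / sum(x)                             # normalise to a probability table
}
tabs <- replicate(3, make_table(), simplify = FALSE)

# Flatten each table into one row of a matrix (cells in column-major order).
mat <- t(vapply(tabs, as.vector, numeric(8)))
dim(mat)   # one row per table, one column per cell

# Selecting rows selects tables; reshaping a row recovers the array.
first <- array(mat[1, ], dim = c(2, 2, 2))
```

With this layout, taking a subset of rows such as mat[c(1, 3), ] is the analogue of selecting a subset of tables.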

However, we can also select specific elements of the tables using their coordinates, optionally leaving the first entry blank:

tab[,1,1,]
## Group of 10 numeric tables of dimension 2
## First entry:
## [1] 0.06882546 0.21224108

The drop argument can be set to FALSE if dimensions of length 1 should be retained:

tab[,1,1,,drop=FALSE]
## Group of 10 numeric tables of dimension 1x1x2
## First entry:
## , , 1
## 
##            [,1]
## [1,] 0.06882546
## 
## , , 2
## 
##           [,1]
## [1,] 0.2122411
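The drop argument mirrors the behaviour of ordinary array indexing in base R. The following small sketch shows the same effect on a single array; the array a is a hypothetical example and is not produced by the package.

```r
# A single 2x2x2 array with entries 1 to 8.
a <- array(1:8, dim = c(2, 2, 2))

# Default: dimensions of length 1 are dropped, leaving a plain vector.
v <- a[1, 1, ]

# With drop = FALSE the result keeps all three dimensions (1 x 1 x 2).
k <- a[1, 1, , drop = FALSE]
dim(k)
```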

Basic Manipulations

Some basic operations are predefined, such as taking the margin of each table, or calculating a conditional distribution.

margin(tab, 2:3)         # margin of second and third dimensions
## Group of 10 numeric tables of dimension 2x2
## First entry:
##            [,1]      [,2]
## [1,] 0.08283627 0.2354453
## [2,] 0.06435303 0.6173654
conditional(tab, 2, 1)  # second dimension conditional on first
## Group of 10 numeric tables of dimension 2x2
## First entry:
##           [,1]      [,2]
## [1,] 0.4324884 0.1062929
## [2,] 0.5675116 0.8937071
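As a cross-check of what these two operations compute, the first table can be rebuilt by hand and processed with base R. This is only a sketch: the values are copied from the printed output above (so they are rounded), and the names p, m23, m21 and c21 are hypothetical.

```r
# First table as printed above (values copied from the output, so rounded).
p <- array(c(0.06882546, 0.01401081, 0.05337011, 0.01098293,
             0.21224108, 0.02320423, 0.3154456,  0.3019198),
           dim = c(2, 2, 2))

# Margin over dimensions 2 and 3: sum out dimension 1.
m23 <- apply(p, 2:3, sum)

# Dimension 2 conditional on dimension 1: form the joint margin with
# dimension 2 in the rows and dimension 1 in the columns, then normalise
# each column so that it sums to one.
m21 <- apply(p, c(2, 1), sum)
c21 <- prop.table(m21, margin = 2)
```

Up to rounding, m23 and c21 should agree with the first entries printed by margin(tab, 2:3) and conditional(tab, 2, 1).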

These functions can also be applied to an ordinary numerical array, with the expected effect. It can also be useful to calculate a margin or conditional distribution while keeping each value in the same position as in the original table. For this purpose the functions margin2() and conditional2() are available.

# as above, but the arrangement of cells in the table is retained
margin2(tab, 2:3)
## Group of 10 numeric tables of dimension 2x2x2
## First entry:
## , , 1
## 
##            [,1]       [,2]
## [1,] 0.08283627 0.06435303
## [2,] 0.08283627 0.06435303
## 
## , , 2
## 
##           [,1]      [,2]
## [1,] 0.2354453 0.6173654
## [2,] 0.2354453 0.6173654
conditional2(tab, 2, 1)  
## Group of 10 numeric tables of dimension 2x2x2
## First entry:
## , , 1
## 
##           [,1]      [,2]
## [1,] 0.4324884 0.5675116
## [2,] 0.1062929 0.8937071
## 
## , , 2
## 
##           [,1]      [,2]
## [1,] 0.4324884 0.5675116
## [2,] 0.1062929 0.8937071
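One situation in which a result of the same shape as the original table may be convenient is cell-by-cell arithmetic. The base R sketch below imitates the margin2() output for the first table by repeating the margin along the summed-out dimension; this is an illustration under that assumption, not the package's implementation, and the names p, m23, m23_full and c1_23 are hypothetical.

```r
# First table as printed earlier (values copied from the output, so rounded).
p <- array(c(0.06882546, 0.01401081, 0.05337011, 0.01098293,
             0.21224108, 0.02320423, 0.3154456,  0.3019198),
           dim = c(2, 2, 2))

# Margin over dimensions 2 and 3.
m23 <- apply(p, 2:3, sum)

# Repeat each margin value along dimension 1 so the result is 2x2x2 again.
m23_full <- array(rep(m23, each = dim(p)[1]), dim = dim(p))

# Arrays of the same shape can be combined cell by cell, for example to get
# the distribution of dimension 1 given dimensions 2 and 3.
c1_23 <- p / m23_full
```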

Functions of Distributions

Some built-in functions of distributions are available, for example the Kullback-Leibler divergence and (conditional) mutual information:

tab2 <- rprobMat(10,2,3)
kl(tab, tab2)   # pairwise Kullback-Leibler divergence
##  [1] 1.7876734 0.8878977 0.9273411 1.0505738 0.9644997 1.5125273 1.4185864
##  [8] 0.9040103 0.3541266 1.3874266
# mutual information between second and third dimensions
mutualInf(tab, 2, 3)
##  [1] 0.0222188296 0.0006586847 0.1097037171 0.0120777172 0.0901299457
##  [6] 0.0374731115 0.0141781905 0.0002883694 0.0016497155 0.0001017643
mutualInf(tab, 2, 3, cond=1)   # conditional mutual information
##  [1] 0.0230257978 0.0080745807 0.1440390623 0.2507592964 0.1161879554
##  [6] 0.0042213826 0.0428370801 0.0004402263 0.0143184393 0.0113316165
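For reference, both quantities can be written out for a single table in a few lines of base R. This sketch assumes natural logarithms and strictly positive cells. The uniform table q is a hypothetical stand-in, since the entries of tab2 are not printed above, and the names p, q, kl_pq, m23, indep and mi_23 are likewise hypothetical.

```r
# First table as printed earlier (values copied from the output, so rounded).
p <- array(c(0.06882546, 0.01401081, 0.05337011, 0.01098293,
             0.21224108, 0.02320423, 0.3154456,  0.3019198),
           dim = c(2, 2, 2))
p <- p / sum(p)                       # remove rounding error in the total

# Hypothetical reference distribution: uniform over the eight cells.
q <- array(1 / 8, dim = c(2, 2, 2))

# Kullback-Leibler divergence of p from q (natural log).
kl_pq <- sum(p * log(p / q))

# Mutual information between dimensions 2 and 3: the divergence of their
# joint margin from the product of the two one-dimensional margins.
m23 <- apply(p, 2:3, sum)
indep <- outer(rowSums(m23), colSums(m23))
mi_23 <- sum(m23 * log(m23 / indep))
```

Here mi_23 comes out at about 0.0222, which is consistent with the first value returned by mutualInf(tab, 2, 3) above.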