crosstab {SparkR} | R Documentation |
Computes a pair-wise frequency table of the given columns, also known as a contingency table. The number of distinct values for each column should be less than 1e4. At most 1e6 non-zero pair frequencies will be returned.
## S4 method for signature 'SparkDataFrame,character,character'
crosstab(x, col1, col2)
x | a SparkDataFrame |
col1 | name of the first column. Distinct items will make the first item of each row. |
col2 | name of the second column. Distinct items will make the column names of the output. |
a local R data.frame representing the contingency table. The first column of each row will be the distinct values of col1 and the column names will be the distinct values of col2. The name of the first column will be "col1_col2".
Pairs that have no occurrences will have zero as their counts.
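The layout of the returned data.frame can be imitated locally as a rough sketch. The code below uses only base R and does not call SparkR; the sample data and the variable names `df`, `counts`, and `ct` are hypothetical, and the column order and value types of a real `crosstab` result may differ.

```r
# Hypothetical sample data; column names mirror the example on this page.
df <- data.frame(
  title  = c("Dr", "Mr", "Mr", "Ms", "Dr"),
  gender = c("F", "M", "M", "F", "M"),
  stringsAsFactors = FALSE
)

# Base R cross-tabulation: rows are the distinct titles,
# columns are the distinct genders.
counts <- table(df$title, df$gender)

# Turn the table into a plain data.frame and prepend the row labels
# as a first column named "<col1>_<col2>", as described for crosstab.
ct <- as.data.frame.matrix(counts)
ct <- data.frame(
  title_gender = rownames(ct),
  ct,
  check.names = FALSE,
  stringsAsFactors = FALSE
)
rownames(ct) <- NULL

print(ct)
```

In this imitation the pair ("Mr", "F") never occurs in the sample data, so its cell holds zero, matching the rule for pairs with no occurrences.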
crosstab since 1.5.0
Other stat functions: approxQuantile(), corr(), cov(), freqItems(), sampleBy()
## Not run:
##D df <- read.json("/path/to/file.json")
##D ct <- crosstab(df, "title", "gender")
## End(Not run)