freqItems {SparkR} | R Documentation |
Finds frequent items for columns, possibly with false positives, using the frequent element count algorithm described in http://dx.doi.org/10.1145/762471.762473, proposed by Karp, Schenker, and Papadimitriou.
## S4 method for signature 'SparkDataFrame,character'
freqItems(x, cols, support = 0.01)
x |
A SparkDataFrame. |
cols |
A vector of column names in which to search for frequent items. |
support |
(Optional) The minimum frequency for an item to be considered frequent. Defaults to 0.01. |
A local R data.frame with the frequent items in each column.
freqItems since 1.6.0
Other stat functions: approxQuantile(), corr(), cov(), crosstab(), sampleBy()
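## Not run:
# Load a SparkDataFrame from a JSON file
df <- read.json("/path/to/file.json")
# Find frequent items in the "title" and "gender" columns (default support = 0.01)
fi <- freqItems(df, c("title", "gender"))
## End(Not run)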
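The description says the result may contain false positives. The following is a minimal base R sketch of a one-pass counter-based frequent-items scheme in the spirit of the cited algorithm. It shows where such false positives can come from. It is an illustration only, not SparkR's implementation. The function name `frequent_candidates` and the choice of `floor(1 / support)` counters are assumptions made for this example, and the sketch assumes non-empty string items.

```r
# Hypothetical helper, not part of SparkR: one-pass frequent-items sketch.
# Keeps at most floor(1 / support) counters. Any item occurring in more than
# support * length(values) positions survives the pass, but some rarer items
# may survive as well (false positives).
frequent_candidates <- function(values, support = 0.01) {
  stopifnot(support > 0, support <= 1)
  values <- values[!is.na(values)]      # assumption: missing values are ignored
  max_counters <- floor(1 / support)
  counts <- integer(0)                  # named integer vector: item -> counter
  for (v in as.character(values)) {
    if (v %in% names(counts)) {
      # Item already tracked: increment its counter.
      counts[[v]] <- counts[[v]] + 1L
    } else if (length(counts) < max_counters) {
      # Free slot available: start tracking the item.
      counts[[v]] <- 1L
    } else {
      # No free slot: decrement every counter and drop those that reach zero.
      counts <- counts - 1L
      counts <- counts[counts > 0L]
    }
  }
  as.character(names(counts))
}

x <- c("a", "b", "a", "c", "a", "d", "a", "e")
print(frequent_candidates(x, support = 0.4))
```

Because the pass never verifies the surviving counters against exact counts, an input such as `c("a", "a", "a", "b")` with `support = 0.4` also keeps `"b"`, even though it appears in only a quarter of the positions.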