spark.als {SparkR} | R Documentation |
spark.als learns latent factors in collaborative filtering via alternating least squares. Users can call summary to obtain fitted latent factors, predict to make predictions on new data, and write.ml/read.ml to save/load fitted models.
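To make the idea of alternating least squares concrete, here is a minimal base R sketch on a small dense ratings matrix. It is an illustration only and not SparkR's implementation, which is distributed and differs in its details. The function name als_sketch and its arguments are hypothetical, and missing ratings are assumed to be encoded as NA.

```r
# Toy explicit-feedback ALS in base R (illustration only; not SparkR's code).
# R is a users x items matrix with NA for unobserved ratings.
als_sketch <- function(R, rank = 2, reg = 0.1, max_iter = 10, seed = 0) {
  set.seed(seed)
  n_users <- nrow(R)
  n_items <- ncol(R)
  observed <- !is.na(R)

  # Small random starting values for both factor matrices.
  U <- matrix(rnorm(n_users * rank, sd = 0.1), n_users, rank)
  V <- matrix(rnorm(n_items * rank, sd = 0.1), n_items, rank)

  # Ridge regression of the observed ratings y on the fixed factors X.
  solve_ridge <- function(X, y) {
    solve(crossprod(X) + reg * diag(rank), crossprod(X, y))
  }

  for (iter in seq_len(max_iter)) {
    # Hold item factors fixed and solve for each user's factor vector.
    for (u in seq_len(n_users)) {
      idx <- which(observed[u, ])
      if (length(idx) > 0) U[u, ] <- solve_ridge(V[idx, , drop = FALSE], R[u, idx])
    }
    # Hold user factors fixed and solve for each item's factor vector.
    for (i in seq_len(n_items)) {
      idx <- which(observed[, i])
      if (length(idx) > 0) V[i, ] <- solve_ridge(U[idx, , drop = FALSE], R[idx, i])
    }
  }
  list(userFactors = U, itemFactors = V)
}

# The six ratings from the example on this page, with ids shifted to 1-based.
R <- matrix(NA_real_, 3, 3)
R[cbind(c(1, 1, 2, 2, 3, 3), c(1, 2, 2, 3, 2, 3))] <- c(4, 2, 3, 4, 1, 5)

fit <- als_sketch(R, rank = 2, reg = 0.1, max_iter = 10)

# A predicted rating is the dot product of a user factor and an item factor.
pred <- fit$userFactors %*% t(fit$itemFactors)
```

Each inner step is an ordinary regularized least-squares solve, which is why the two factor matrices can be updated in turn.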
spark.als(data, ...)

## S4 method for signature 'SparkDataFrame'
spark.als(data, ratingCol = "rating", userCol = "user", itemCol = "item",
  rank = 10, regParam = 0.1, maxIter = 10, nonnegative = FALSE,
  implicitPrefs = FALSE, alpha = 1, numUserBlocks = 10, numItemBlocks = 10,
  checkpointInterval = 10, seed = 0)

## S4 method for signature 'ALSModel'
summary(object)

## S4 method for signature 'ALSModel'
predict(object, newData)

## S4 method for signature 'ALSModel,character'
write.ml(object, path, overwrite = FALSE)
data |
a SparkDataFrame for training. |
... |
additional argument(s) passed to the method. |
ratingCol |
column name for ratings. |
userCol |
column name for user ids. Ids must be (or can be coerced into) integers. |
itemCol |
column name for item ids. Ids must be (or can be coerced into) integers. |
rank |
rank of the matrix factorization (> 0). |
regParam |
regularization parameter (>= 0). |
maxIter |
maximum number of iterations (>= 0). |
nonnegative |
logical value indicating whether to apply nonnegativity constraints. |
implicitPrefs |
logical value indicating whether to use implicit preference. |
alpha |
alpha parameter in the implicit preference formulation (>= 0). |
numUserBlocks |
number of user blocks used to parallelize computation (> 0). |
numItemBlocks |
number of item blocks used to parallelize computation (> 0). |
checkpointInterval |
checkpoint interval in iterations (>= 1), or -1 to disable checkpointing. |
seed |
integer seed for random number generation. |
object |
a fitted ALS model. |
newData |
a SparkDataFrame for testing. |
path |
the directory where the model is saved. |
overwrite |
logical value indicating whether to overwrite if the output path already exists. The default is FALSE, which throws an exception if the output path exists. |
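As a rough illustration of what implicitPrefs and alpha control, the base R sketch below follows the usual implicit-feedback formulation (Hu, Koren and Volinsky), in which a raw interaction strength is turned into a binary preference and a confidence weight. The helper implicit_weights is hypothetical and is not part of SparkR; how SparkR applies these quantities internally is not shown here.

```r
# Hypothetical helper: map raw interaction strengths r (for example, click
# counts) to the preference and confidence used by implicit-feedback ALS.
implicit_weights <- function(r, alpha = 1) {
  stopifnot(alpha >= 0)
  list(
    preference = as.numeric(r > 0),  # 1 if any interaction was observed
    confidence = 1 + alpha * abs(r)  # stronger signal, higher confidence
  )
}

w <- implicit_weights(c(0, 1, 5), alpha = 2)
w$preference
w$confidence
```

A larger alpha makes observed interactions count more relative to unobserved ones, and alpha = 0 gives every entry the same confidence.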
For more details, see MLlib: Collaborative Filtering.
spark.als returns a fitted ALS model.
summary returns summary information of the fitted model, which is a list. The list includes user (the name of the user column), item (the name of the item column), rating (the name of the rating column), userFactors (the estimated user factors), itemFactors (the estimated item factors), and rank (rank of the matrix factorization model).
predict returns a SparkDataFrame containing predicted values.
spark.als since 2.1.0
summary(ALSModel) since 2.1.0
predict(ALSModel) since 2.1.0
write.ml(ALSModel, character) since 2.1.0
## Not run:
##D # start (or reuse) a Spark session before creating a SparkDataFrame
##D sparkR.session()
##D ratings <- list(list(0, 0, 4.0), list(0, 1, 2.0), list(1, 1, 3.0), list(1, 2, 4.0),
##D                 list(2, 1, 1.0), list(2, 2, 5.0))
##D df <- createDataFrame(ratings, c("user", "item", "rating"))
##D model <- spark.als(df, "rating", "user", "item")
##D
##D # extract latent factors
##D stats <- summary(model)
##D userFactors <- stats$userFactors
##D itemFactors <- stats$itemFactors
##D
##D # make predictions
##D predicted <- predict(model, df)
##D showDF(predicted)
##D
##D # save and load the model
##D path <- "path/to/model"
##D write.ml(model, path)
##D savedModel <- read.ml(path)
##D summary(savedModel)
##D
##D # set other arguments
##D modelS <- spark.als(df, "rating", "user", "item", rank = 20,
##D                     regParam = 0.1, nonnegative = TRUE)
##D statsS <- summary(modelS)
## End(Not run)