Removedor de Sufixos da Língua Portuguesa (Suffix Remover for the Portuguese Language)
This package implements the stemming algorithm described in the article "A Stemming Algorithm for the Portuguese Language" by Viviane Moreira Orengo and Christian Huyck.
The idea of the stemmer is very well explained by the following schema.
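As a complement to the schema, a single suffix-stripping rule of the kind such a stemmer applies can be sketched in base R. The `apply_rule` helper and the rules passed to it are hypothetical and are not part of the package; they only illustrate the general idea of removing a suffix when the remaining stem is long enough. The actual algorithm applies many more rules in a fixed sequence of steps, with exception lists.

```r
# Hypothetical helper, not part of rslp: apply one suffix-stripping rule.
# The suffix is removed only if the remaining stem keeps at least
# `min_stem` characters; `replacement` is then appended to the stem.
apply_rule <- function(word, suffix, min_stem, replacement = "") {
  if (!endsWith(word, suffix)) {
    return(word)
  }
  stem <- substr(word, 1, nchar(word) - nchar(suffix))
  if (nchar(stem) < min_stem) {
    return(word)
  }
  paste0(stem, replacement)
}

# Illustrative rules only; the real rule set is much larger.
apply_rule("gostaram", "aram", min_stem = 2)
#> [1] "gost"
apply_rule("gostou", "ou", min_stem = 3)
#> [1] "gost"
apply_rule("sou", "ou", min_stem = 3)
#> [1] "sou"
```

The minimum stem length is what keeps short words such as "sou" from being reduced to a single letter.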
To install the package you can use the following:
The main function of the package is rslp. You can call it on a character vector like this:
library(rslp)
words <- c("balões", "aviões", "avião", "gostou", "gosto", "gostaram")
rslp(words)
#> [1] "bal" "avi" "avi" "gost" "gost" "gost"
To stem every word in a vector of texts, use the rslp_doc function.
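A minimal sketch of calling rslp_doc on a small character vector, where each element is one text. The sentences are made up for illustration, and the sketch assumes that the default arguments are sufficient and that the function returns one stemmed text per input element.

```r
library(rslp)

# Example documents chosen for illustration only.
docs <- c(
  "coma frutas pois elas fazem bem",
  "os doces fazem mal para os dentes"
)

# Assumption: the result is a character vector with one
# stemmed text for each element of `docs`.
stemmed <- rslp_doc(docs)
stemmed
```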