The goal of nntrf is to transform datasets from their original feature space to the space defined by the activations of the hidden layer of a 3-layer Multi-layer Perceptron. This is done by training a neural network and then computing the activations of the neural network for each input pattern. Package nnet is used under the hood for this purpose. It is a supervised transformation because it results from solving a supervised problem, as opposed (for instance) to Principal Component Analysis.
The following example shows how to transform the iris dataset, from the original 4-dimension space, into a 2-dimension space by means of nntrf. For nntrf, if the dependent variable is a factor (like for iris), then it is considered a classification problem, otherwise (if it is a number), then it is considered a regression problem.
iris <- NULL
data("iris", envir = environment())
rd <- iris
n <- nrow(rd)
# Species is already a factor. Conversion is here to remark that for classification problems
# the dependent variable must be a factor.
rd$Species <- as.factor(rd$Species)
set.seed(0)
training_index <- sample(1:n, round(0.6*n))
# Get training and test data
train <- rd[training_index,]
test <- rd[-training_index,]
x_train <- as.matrix(train[,-ncol(train)])
y_train <- train[,ncol(train)]
x_test <- as.matrix(test[,-ncol(test)])
y_test <- test[,ncol(test)]
# Now, use nntrf to transform the original 4-dim space into a 2-dim space (size=2)
# First, we train the neural network with 2 hidden neurons
set.seed(0)
nnpo <- nntrf(formula=Species ~. ,
data=train,
size=2, maxit=140, trace=TRUE)
# Second, we transform the dataset using the weights of the hidden layer
trf_x_train <- nnpo$trf(x=x_train,use_sigmoid=FALSE)
trf_x_test <- nnpo$trf(x=x_test,use_sigmoid=FALSE)
# It can be seen that the new feature space is 2-dimensional
print(dim(trf_x_train))
# Third, KNN is used to classify the dataset on the transformed space
outputs <- FNN::knn(trf_x_train, trf_x_test, train$Species)
success <- mean(outputs == test$Species)
cat(paste0("Success rate of KNN (K=1) with iris transformed by nntrf ", success, "\n"))
# For comparison purposes, next KNN is used to classify the iris dataset on the original space
set.seed(0)
outputs <- FNN::knn(x_train, x_test, train$Species)
success <- mean(outputs == test$Species)
cat(paste0("Success rate of KNN (K=1) with original iris ", success, "\n"))