A package for precise approximate nearest neighbor search in more than just Euclidean space.
Its only exported function, `find_knn`, computes the `k` nearest neighbors of the rows of the `query` matrix in the `data` matrix. If no `query` matrix is passed, the nearest neighbors for all rows in the data will be returned (i.e. `data` will be used as `query`).
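A minimal usage sketch of the two calling modes described above. It assumes the package is installed under the name knn.covertree and that `data`, `k` and `query` can be passed as named arguments; the matrices are made-up random data.

```r
# Sketch only: assumes knn.covertree is installed and that find_knn()
# accepts `data`, `k` and `query` as described in the text.
if (requireNamespace("knn.covertree", quietly = TRUE)) {
  set.seed(1)
  data  <- matrix(rnorm(100 * 5), nrow = 100)  # 100 made-up points in 5 dimensions
  query <- matrix(rnorm(10 * 5), nrow = 10)    # 10 made-up query points

  # No query given: neighbors are searched for every row of `data`
  self_knn <- knn.covertree::find_knn(data, k = 3)

  # Query given: neighbors in `data` are searched for every row of `query`
  query_knn <- knn.covertree::find_knn(data, k = 3, query = query)

  # Inspect the top-level structure of the returned list
  str(query_knn, max.level = 1)
}
```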
The result will be a list containing:
- `index`, an `nrow(query)` × `k` integer matrix containing the row indices into `data` that are the nearest neighbors.
- `dist`, an `nrow(query)` × `k` double matrix containing the distances to those neighbors.
- `dist_mat`, an `nrow(query)` × `nrow(data)` `Matrix::dSparseMatrix`, generic if `!sym` or `!is.null(query)`, and symmetric if `sym` and `is.null(query)`. Zeros in this matrix mean “not a kNN”, and if `sym` is set, the matrix will be post-processed to be symmetric.
(Without post-processing, the matrix will likely be asymmetric, as r1 ∈ kNN(r2) does not imply r2 ∈ kNN(r1).)
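The asymmetry can be seen on a toy example. The following sketch uses a brute-force search in base R, not the package's cover tree, and the `pmax` step is only one hypothetical way to symmetrize; the post-processing the package actually applies may differ.

```r
# Brute-force kNN in base R, only to illustrate the asymmetry;
# this is not how knn.covertree computes neighbors.
pts <- matrix(c(0, 1, 3), ncol = 1)    # three made-up points on a line
d <- as.matrix(dist(pts))              # full Euclidean distance matrix
k <- 1

# Row i keeps only the distances to the k nearest neighbors of point i;
# zero means "not a kNN", mirroring the convention described for dist_mat.
knn_mat <- matrix(0, nrow(d), ncol(d))
for (i in seq_len(nrow(d))) {
  # The point itself sorts first (distance 0, no duplicate points here), so drop it
  nn <- order(d[i, ])[-1][seq_len(k)]
  knn_mat[i, nn] <- d[i, nn]
}

# Point 2 is the nearest neighbor of point 3, but not the other way around
isSymmetric(knn_mat)                   # FALSE

# Hypothetical symmetrization: keep an entry if either direction is a kNN
sym_mat <- pmax(knn_mat, t(knn_mat))
isSymmetric(sym_mat)                   # TRUE
```

Here the point at 3 has the point at 1 as its nearest neighbor, while the point at 1 is closer to the point at 0, so the unprocessed matrix has a nonzero entry at (3, 2) but a zero at (2, 3).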
This package was separated from destiny as it might prove helpful in other contexts. It provides more distance metrics than FNN and is more precise than RcppHNSW, but slower than both.
If anyone knows a faster and similarly precise kNN search in cosine (=rank correlation) space, please tell me!