
Search for approximate nearest neighbors between cells in a reduced-dimension space (e.g. PCA), and return the k nearest neighbors (knn) for each cell. Optionally, find neighbors between two separate sets of cells by supplying both data and query.

Usage

knn_hnsw(
  data,
  query = NULL,
  k = 10,
  metric = c("euclidean", "cosine"),
  verbose = TRUE,
  threads = 1,
  ef = 100
)

knn_annoy(
  data,
  query = data,
  k = 10,
  metric = c("euclidean", "cosine", "manhattan", "hamming"),
  n_trees = 50,
  search_k = -1
)

Arguments

data

cell x dims matrix for reference dataset

query

cell x dims matrix for query dataset (optional)

k

number of neighbors to calculate

metric

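distance metric to use. knn_hnsw accepts "euclidean" or "cosine"; knn_annoy additionally accepts "manhattan" and "hamming"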

verbose

whether to print progress information during search

threads

Number of threads to use. Note that the result is non-deterministic if threads > 1

ef

ef parameter for RcppHNSW::hnsw_search. Increase for slower search but improved accuracy

n_trees

Number of trees to build during index construction. More trees give higher accuracy

search_k

Number of nodes to inspect during the query, or -1 for the default value. A higher number gives higher accuracy

Value

List of 2 matrices: idx (cell x K neighbor indices) and dist (cell x K neighbor distances). If no query is given, nearest neighbors are found by mapping the data matrix to itself, with self-neighbors prohibited
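
To make the shape of the return value concrete, the following base R sketch computes exact (brute-force) Euclidean neighbors and returns the same kind of list described above. It is not how knn_hnsw or knn_annoy work internally, since those use approximate indexes. knn_exact is a hypothetical helper written only for this illustration.

```r
# Exact brute-force kNN in base R, for illustration only.
# knn_exact is a hypothetical helper, not part of any package.
knn_exact <- function(data, query = NULL, k = 10) {
  self_search <- is.null(query)
  if (self_search) query <- data
  n <- nrow(query)
  # Squared Euclidean distances via |q|^2 + |d|^2 - 2 * q.d
  d2 <- outer(rowSums(query^2), rowSums(data^2), "+") -
    2 * tcrossprod(query, data)
  d2 <- pmax(d2, 0)                 # clamp tiny negatives from rounding
  if (self_search) diag(d2) <- Inf  # prohibit self-neighbors
  # For each query row, keep the indices of the k smallest distances
  nearest <- apply(d2, 1, function(r) order(r)[seq_len(k)])
  idx <- matrix(nearest, nrow = n, ncol = k, byrow = TRUE)
  # Look up the matching distances, keeping the cell x k layout
  dist <- sqrt(matrix(d2[cbind(as.vector(row(idx)), as.vector(idx))], nrow = n))
  list(idx = idx, dist = dist)
}

set.seed(1)
pca <- matrix(rnorm(20 * 3), nrow = 20)  # 20 simulated cells x 3 dims
res <- knn_exact(pca, k = 4)
dim(res$idx)   # cell x k neighbor indices
dim(res$dist)  # cell x k neighbor distances
```

An exact search like this scales quadratically with the number of cells, so it is mainly useful as a reference for checking approximate results on small inputs.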

Details

knn_hnsw: Use RcppHNSW as knn engine

knn_annoy: Use RcppAnnoy as knn engine
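
A minimal usage sketch follows, assuming these functions are exported by the BPCells package and that RcppHNSW and RcppAnnoy are installed. The random matrix stands in for a real PCA embedding, and the variable names (cell_embedding, ref, qry) are illustrative rather than taken from the package.

```r
set.seed(42)
# Simulated embedding: 200 cells x 10 dims (stand-in for real PCA output)
cell_embedding <- matrix(rnorm(200 * 10), nrow = 200)
ref <- cell_embedding[1:150, ]
qry <- cell_embedding[151:200, ]

hnsw_res <- NULL
annoy_res <- NULL

# Only run if the assumed packages are available
have_pkgs <- requireNamespace("BPCells", quietly = TRUE) &&
  requireNamespace("RcppHNSW", quietly = TRUE) &&
  requireNamespace("RcppAnnoy", quietly = TRUE)

if (have_pkgs) {
  # Self-search: no query given, so neighbors are found within data
  hnsw_res <- BPCells::knn_hnsw(
    cell_embedding, k = 5, metric = "cosine", verbose = FALSE, ef = 200
  )
  # Reference/query search: neighbors of each query cell among reference cells
  annoy_res <- BPCells::knn_annoy(ref, query = qry, k = 5, n_trees = 100)
}
```

Raising ef (HNSW) or n_trees and search_k (Annoy) trades speed for accuracy, as described in the arguments above.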