This paper examines the problem of database organization and retrieval based on computing metric pairwise distances. A low-dimensional Euclidean approximation of a high-dimensional metric space is not efficient, while search in a high-dimensional Euclidean space suffers from the "curse of dimensionality". Thus, techniques designed for searching metric spaces must be used. We evaluate several existing exact metric-based indexing techniques and show that they require extensive computational effort. This motivates the development of an approximate nearest neighbor search technique in which the k nearest neighbors are used to approximate the local neighborhood of a point. The resulting kNN graph is searched in a best-first fashion, producing excellent indexing efficiency.
Thomas B. Sebastian, Benjamin B. Kimia
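
The idea of searching a nearest-neighbor graph in best-first order can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`build_knn_graph`, `best_first_search`), the brute-force graph construction, the fixed start node, and the stopping rule are all assumptions made for illustration. The only requirement on `dist` is that it is a distance function over the stored items.

```python
import heapq


def build_knn_graph(items, dist, k):
    """Brute-force graph (assumed construction): node i -> its k nearest items."""
    graph = []
    for i, a in enumerate(items):
        ranked = sorted((dist(a, b), j) for j, b in enumerate(items) if j != i)
        graph.append([j for _, j in ranked[:k]])
    return graph


def best_first_search(items, graph, dist, query, start=0):
    """Approximate nearest neighbor search over the graph.

    Always expands the unexpanded node closest to the query, and stops
    when that node is farther than the best match found so far.
    Returns (best_index, best_distance, number_of_distance_computations).
    """
    d_start = dist(query, items[start])
    frontier = [(d_start, start)]      # min-heap keyed on distance to query
    visited = {start}
    best_d, best_i = d_start, start
    calls = 1
    while frontier:
        d, i = heapq.heappop(frontier)
        if d > best_d:
            break                      # local minimum reached: stop searching
        for j in graph[i]:
            if j in visited:
                continue
            visited.add(j)
            dj = dist(query, items[j])
            calls += 1
            if dj < best_d:
                best_d, best_i = dj, j
            heapq.heappush(frontier, (dj, j))
    return best_i, best_d, calls


# Toy usage: 20 points on a line with absolute difference as the metric.
items = [float(i) for i in range(20)]
metric = lambda a, b: abs(a - b)
graph = build_knn_graph(items, metric, k=2)
best_i, best_d, calls = best_first_search(items, graph, metric, query=13.2)
print(best_i, calls)
```

In this toy case the search reaches the true nearest item while computing fewer distances than a linear scan over all 20 items. In general the result is approximate: the walk can stop at a local minimum of the graph, and the quality of the answer depends on k and on the choice of start node.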