Sciweavers

VLDB
1999
ACM

Similarity Search in High Dimensions via Hashing

The nearest- or near-neighbor query problems arise in a large variety of database applications, usually in the context of similarity searching. Of late, there has been increasing interest in building search index structures for performing similarity search over high-dimensional data, e.g., image databases, document collections, time-series databases, and genome databases. Unfortunately, all known techniques for solving this problem fall prey to the "curse of dimensionality." That is, the data structures scale poorly with data dimensionality; in fact, if the number of dimensions exceeds 10 to 20, searching in k-d trees and related structures involves the inspection of a large fraction of the database, thereby doing no better than brute-force linear search. It has been suggested that since the selection of features and the choice of a distance metric in typical applications is rather heuristic, determining an approximate nearest neighbor should suffice for most practical purposes. In ...
Aristides Gionis, Piotr Indyk, Rajeev Motwani
Added 05 Aug 2010
Updated 05 Aug 2010
Type Conference
Year 1999
Where VLDB