Sciweavers

DSMML
2004
Springer

Redundant Bit Vectors for Quickly Searching High-Dimensional Regions

Applications such as audio fingerprinting require search in high dimensions: find an item in a database that is similar to a query. An important property of this search task is that negative answers are very frequent: much of the time, a query does not correspond to any database item. We propose Redundant Bit Vectors (RBVs): a novel method for quickly solving this search problem. RBVs rely on three key ideas: 1) approximate the high-dimensional regions/distributions as tightened hyperrectangles, 2) partition the query space to store each item redundantly in an index, and 3) use bit vectors to store and search the index efficiently. We show that our method is the preferred method for very large databases or when the queries are often not in the database. Our method is 109 times faster than linear scan, and 48 times faster than locality-sensitive hashing on a data set of 239,369 audio fingerprints.
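The three ideas in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the class name `RedundantBitVectorIndex`, the fixed bin edges, and the use of Python integers as bit vectors are all assumptions made for illustration, and the tightening of the hyperrectangles is not shown. Each item is an axis-aligned box; in every dimension the query axis is split into bins, and an item's bit is set in every bin its interval overlaps (the redundant storage). A query looks up one bit vector per dimension and ANDs them together, so a query that matches nothing is usually rejected after a few bitwise operations.

```python
from bisect import bisect_right


class RedundantBitVectorIndex:
    """Illustrative sketch only; names and binning scheme are assumptions."""

    def __init__(self, boxes, edges):
        # boxes: list of (lo, hi) pairs, each a tuple of per-dimension bounds.
        # edges: for each dimension, a sorted list of interior bin boundaries
        #        (hypothetical; the abstract does not say how bins are chosen).
        self.boxes = boxes
        self.edges = edges
        self.dims = len(edges)
        self.bits = []  # bits[d][b] is an int whose bit i marks item i
        for d in range(self.dims):
            vectors = [0] * (len(edges[d]) + 1)
            for i, (lo, hi) in enumerate(boxes):
                first = bisect_right(edges[d], lo[d])
                last = bisect_right(edges[d], hi[d])
                # Store the item redundantly in every bin its interval touches.
                for b in range(first, last + 1):
                    vectors[b] |= 1 << i
            self.bits.append(vectors)

    def query(self, q):
        # Start with every item as a candidate, then AND one vector per dimension.
        candidates = (1 << len(self.boxes)) - 1
        for d in range(self.dims):
            b = bisect_right(self.edges[d], q[d])
            candidates &= self.bits[d][b]
            if candidates == 0:
                return []  # early exit: the common negative answer
        # Bins are coarse, so survivors still need an exact containment check.
        hits = []
        i = 0
        while candidates:
            if candidates & 1:
                lo, hi = self.boxes[i]
                if all(lo[d] <= q[d] <= hi[d] for d in range(self.dims)):
                    hits.append(i)
            candidates >>= 1
            i += 1
        return hits


boxes = [
    ((0.0, 0.0), (1.0, 1.0)),
    ((2.0, 2.0), (3.0, 3.0)),
    ((0.5, 2.5), (1.5, 3.5)),
]
index = RedundantBitVectorIndex(boxes, edges=[[1.0, 2.0], [1.0, 2.0]])
print(index.query((0.5, 0.5)))
print(index.query((5.0, 5.0)))
```

In this toy setup the second query survives the bitwise AND for one item but fails the exact check, which is why the sketch keeps a verification step after the bit-vector filter.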
Jonathan Goldstein, John C. Platt, Christopher J.
Added 01 Jul 2010
Updated 01 Jul 2010
Type Conference
Year 2004
Where DSMML
Authors Jonathan Goldstein, John C. Platt, Christopher J. C. Burges