Sciweavers

EDBT
2009
ACM

Distributed similarity search in high dimensions using locality sensitive hashing

In this paper we consider distributed K-Nearest Neighbor (KNN) search and range query processing in high-dimensional data. Our approach is based on Locality Sensitive Hashing (LSH), which has proven very efficient in answering KNN queries in centralized settings. We consider mappings from the multi-dimensional LSH bucket space to the linearly ordered set of peers that jointly maintain the indexed data, and derive requirements for achieving high-quality search results while limiting the number of network accesses. We put forward two such mappings with these salient properties: they are locality preserving, so that buckets likely to hold similar data are stored on the same or neighboring peers, and they have a predictable output distribution, which ensures fair load balancing. We show how to leverage the linearly aligned data for efficient KNN search and how to efficiently process range queries, which is, to the best of our knowledge, not possible in existing LSH schemes. We show by comprehensive per...
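
The idea of mapping multi-dimensional LSH buckets onto a linearly ordered set of peers can be illustrated with a small sketch. It is not the paper's method: the abstract does not say which two mappings the authors propose. The sketch assumes a standard p-stable (Gaussian) LSH family, and it uses a hypothetical mapping, the sum of the bucket label's coordinates, as one simple example of a locality-preserving projection. The names make_hash_family, bucket_label, linear_key, peer_for_key and peers_for_range, and the boundaries list of split points, are all made up for illustration.

```python
import bisect
import math
import random


def make_hash_family(dim, k, width, seed=0):
    """Draw k p-stable LSH functions h(v) = floor((a . v + b) / width).

    Each function is a pair (a, b): a Gaussian vector a and an offset b
    drawn uniformly from [0, width).
    """
    rng = random.Random(seed)
    return [
        ([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, width))
        for _ in range(k)
    ]


def bucket_label(vec, family, width):
    """Return the k-dimensional integer bucket label of vec."""
    return tuple(
        math.floor((sum(a_i * x_i for a_i, x_i in zip(a, vec)) + b) / width)
        for a, b in family
    )


def linear_key(label):
    """ASSUMPTION: map a bucket label to one dimension by summing its
    coordinates. Labels that differ by 1 in a single coordinate get keys
    that differ by 1, so neighboring buckets stay close on the line."""
    return sum(label)


def peer_for_key(key, boundaries):
    """Return the index of the peer responsible for key.

    boundaries is a sorted list of split points, so there are
    len(boundaries) + 1 peers, each holding a contiguous key range.
    """
    return bisect.bisect_right(boundaries, key)


def peers_for_range(low_key, high_key, boundaries):
    """A contiguous key range touches a contiguous run of peers."""
    return range(peer_for_key(low_key, boundaries),
                 peer_for_key(high_key, boundaries) + 1)
```

With split points [0, 10], three peers share the key line. A query hashes its vector with bucket_label, converts the label with linear_key, and contacts peer_for_key(key, boundaries) plus, if needed, its immediate neighbors. How the split points should be chosen to balance load depends on the distribution of the keys, which this sketch does not model.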
Parisa Haghani, Sebastian Michel, Karl Aberer
Added: 19 May 2010
Updated: 19 May 2010
Type: Conference
Year: 2009
Where: EDBT
Authors: Parisa Haghani, Sebastian Michel, Karl Aberer