PODS
1997
ACM

A Cost Model For Nearest Neighbor Search in High-Dimensional Data Space

In this paper, we present a new cost model for nearest neighbor search in high-dimensional data space. We first analyze different nearest neighbor algorithms, present a generalization of an algorithm originally proposed for Quadtrees [13], and show that this algorithm is optimal. Then, we develop a cost model which, in contrast to previous models, takes boundary effects into account and therefore also works in high dimensions. In particular, our model works for data sets with an arbitrary number of dimensions and an arbitrary number of data points, is applicable to different data distributions and index structures, and provides accurate estimates of the expected query execution time. To show the practical relevance and accuracy of our model, we perform a detailed analysis using synthetic and real data. The results of applying our model to Hilbert and X-tree indices show that it provides a good estimation of the query performance, which ...
Added 07 Aug 2010
Updated 07 Aug 2010
Type Conference
Year 1997
Where PODS
Authors Stefan Berchtold, Christian Böhm, Daniel A. Keim, Hans-Peter Kriegel