Sciweavers

PODS
1998
ACM

A Cost Model for Similarity Queries in Metric Spaces

We consider the problem of estimating CPU (distance computations) and I/O costs for processing range and k-nearest neighbors queries over metric spaces. In the specific case of vector spaces, information on data distribution has been exploited to derive cost models for predicting the performance of multi-dimensional access methods. In a generic metric space no such possibility exists, which makes the problem quite different and requires a novel approach. We argue that the distance distribution of objects can be profitably used to solve the problem, and consequently develop a concrete cost model for the M-tree access method [10]. Our results rely on the assumption that the indexed dataset comes from a metric space which is “homogeneous” enough (in a probabilistic sense) to allow reliable cost estimations even if the distance distribution with respect to a specific query object is unknown. We experimentally validate the model over both real and synthetic datasets, ...
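The abstract is truncated here and gives no formulas, so the following Python sketch is only an illustration of how a distance distribution could drive such an estimate, not the paper's model. It assumes that a tree node with covering radius r is visited by a range query of radius r_q with probability roughly F(r + r_q), where F is the overall distance distribution estimated from sampled object pairs. The function names, the `(covering_radius, num_entries)` node summary, and the sample size are all hypothetical.

```python
import bisect
import random


def empirical_distance_distribution(objects, dist, pairs=2000, seed=0):
    """Estimate F(x) = Pr{d(O1, O2) <= x} from randomly sampled object pairs.

    Assumption: the sample is taken over the whole dataset, which only makes
    sense if the space is "homogeneous" in the sense used by the abstract.
    """
    rng = random.Random(seed)
    sample = sorted(dist(*rng.sample(objects, 2)) for _ in range(pairs))

    def F(x):
        # Fraction of sampled distances that are <= x.
        return bisect.bisect_right(sample, x) / len(sample)

    return F


def estimate_range_query_cost(F, nodes, query_radius):
    """Return (expected node reads, expected distance computations).

    nodes: iterable of (covering_radius, num_entries), a hypothetical
    per-node summary of the index.
    Assumption: a node is accessed with probability F(radius + query_radius),
    and every entry of an accessed node costs one distance computation.
    """
    io_cost = sum(F(r + query_radius) for r, _ in nodes)
    cpu_cost = sum(e * F(r + query_radius) for r, e in nodes)
    return io_cost, cpu_cost


if __name__ == "__main__":
    # Toy metric space: points on a line with absolute difference as distance.
    data = [float(i) for i in range(100)]
    F = empirical_distance_distribution(data, lambda a, b: abs(a - b))
    toy_nodes = [(50.0, 4), (12.0, 25), (12.0, 25), (12.0, 25), (12.0, 25)]
    print(estimate_range_query_cost(F, toy_nodes, query_radius=5.0))
```

The sketch needs only a distance function and a per-node summary, which mirrors the abstract's point that no coordinate-based data distribution is available in a generic metric space.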
Paolo Ciaccia, Marco Patella, Pavel Zezula
Added 05 Aug 2010
Updated 05 Aug 2010
Type Conference
Year 1998
Where PODS