We study selectivity estimation techniques for set similarity queries. A wide variety of similarity measures for sets have been proposed in the past. In this work we concentrate o...
Marios Hadjieleftheriou, Xiaohui Yu, Nick Koudas, ...
We consider the problem of estimating CPU (distance computations) and I/O costs for processing range and k-nearest neighbors queries over metric spaces. Unlike the specific case ...
We introduce a cost model for the M-tree access method [Ciaccia et al., 1997] which provides estimates of CPU (distance computations) and I/O costs for the execution of similarity ...
Large-scale distributed data management with P2P systems requires the existence of similarity operators for queries as we cannot assume that all users will agree on exactly the sa...
We present PROUD - A PRObabilistic approach to processing similarity queries over Uncertain Data streams, where the data streams here are mainly time series streams. In contrast t...
Mi-Yen Yeh, Kun-Lung Wu, Philip S. Yu, Ming-Syan C...