Duplication of data in storage systems is becoming increasingly common. We introduce I/O Deduplication, a storage optimization that utilizes content similarity for improving I/O p...
We propose a strategy to perform query processing on P2P similarity search systems based on peers and superpeers. We show that by approximating global but resumed inform...
Approximate queries on a collection of strings are important in many applications such as record linkage, spell checking, and Web search, where inconsistencies and errors exist in...
The grand tour, one of the most popular methods for multidimensional data exploration, is based on orthogonally projecting multidimensional data to a sequence of lower dimensional...
Many time series data mining problems require subsequence similarity search as a subroutine. While this can be performed with any distance measure, and dozens of distance measures ...
Doruk Sart, Abdullah Mueen, Walid A. Najjar, Eamon...