We propose efficient techniques for processing various TopK count queries on data with noisy duplicates. Our method differs from existing work on duplicate elimination in two sign...
Sunita Sarawagi, Vinay S. Deshpande, Sourabh Kasli...
We study an important data analysis operator, which extracts the k most important groups from data (i.e., the k groups with the highest aggregate values). In a data warehousing co...
Summaries of massive data sets support approximate query processing over the original data. A basic aggregate over a set of records is the weight of subpopulations specified as a ...
Large-scale data analysis has become increasingly important for many enterprises. Recently, a new distributed computing paradigm, called MapReduce, and its open source implementat...
When similarity queries over multimedia databases are processed by splitting the overall query condition into a set of sub-queries, the problem of how to efficiently and effectiv...
Ilaria Bartolini, Paolo Ciaccia, Vincent Oria, M. ...