Sciweavers

VLDB
1998
ACM

Computing Iceberg Queries Efficiently

Many applications compute aggregate functions over an attribute (or set of attributes) to find aggregate values above some specified threshold. We call such queries iceberg queries, because the number of above-threshold results is often very small (the tip of an iceberg) relative to the large amount of input data (the iceberg). Such iceberg queries are common in many applications, including data warehousing, information retrieval, market basket analysis in data mining, clustering, and copy detection. We propose efficient algorithms to evaluate iceberg queries using very little memory and significantly fewer passes over the data than current techniques that use sorting or hashing. We present an experimental case study using over three gigabytes of Web data to illustrate the savings obtained by our algorithms.
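To make the definition concrete, here is a minimal Python sketch of an iceberg query over item counts. It is an illustration under stated assumptions, not the algorithms proposed in the paper, which the abstract does not describe. The function names, the `num_buckets` parameter, and the choice of COUNT as the aggregate with an "at or above" threshold are all hypothetical. The first function is the straightforward hashing baseline, which keeps one counter per distinct item. The second is a generic two-pass filter that keeps one counter per hash bucket in the first pass and counts exactly only the items that land in heavy buckets.

```python
from collections import Counter


def iceberg_naive(items, threshold):
    """Baseline: count every distinct item, then keep the heavy ones."""
    counts = Counter(items)
    return {item: n for item, n in counts.items() if n >= threshold}


def iceberg_two_pass(items, threshold, num_buckets=1024):
    """Two-pass sketch: coarse bucket counts, then exact counts for survivors.

    `items` must be re-iterable (for example a list), since it is scanned twice.
    """
    # Pass 1: one counter per hash bucket instead of one per distinct item.
    buckets = [0] * num_buckets
    for item in items:
        buckets[hash(item) % num_buckets] += 1

    # Pass 2: a bucket's count is an upper bound on the count of each item
    # hashed into it, so any item in a light bucket can be skipped safely.
    # Light items that share a heavy bucket are false positives; the exact
    # count below removes them.
    candidates = Counter(
        item for item in items
        if buckets[hash(item) % num_buckets] >= threshold
    )
    return {item: n for item, n in candidates.items() if n >= threshold}


if __name__ == "__main__":
    data = ["a"] * 5 + ["b"] * 3 + ["c"] + ["d"] * 4
    print(iceberg_naive(data, 4))
    print(iceberg_two_pass(data, 4, num_buckets=8))
```

Both functions return the same answer. The second one trades an extra scan of the input for a first-pass memory footprint that is fixed by `num_buckets` rather than by the number of distinct items.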
Added 06 Aug 2010
Updated 06 Aug 2010
Type Conference
Year 1998
Where VLDB
Authors Min Fang, Narayanan Shivakumar, Hector Garcia-Molina, Rajeev Motwani, Jeffrey D. Ullman