Sciweavers

SIGMOD
1998
ACM

New Sampling-Based Summary Statistics for Improving Approximate Query Answers

14 years 3 months ago
New Sampling-Based Summary Statistics for Improving Approximate Query Answers
In large data recording and warehousing environments, it is often advantageous to provide fast, approximate answers to queries, whenever possible. Before DBMSs providing highly-accurate approximate answers can become a reality, many new techniques for summarizing data and for estimating answers from summarized data must be developed. This paper introduces two new samplingbased summary statistics, concise samples and counting samples, and presents new techniques for their fast incremental maintenance regardless of the data distribution. We quantify their advantages over standard sample views in terms of the number of additional sample points for the same view size, and hence in providing more accurate query answers. Finally, we consider their application to providing fast approximate answers to hot list queries. Our algorithms maintain their accuracy in the presence of ongoing insertions to the data warehouse.
Phillip B. Gibbons, Yossi Matias
Added 05 Aug 2010
Updated 05 Aug 2010
Type Conference
Year 1998
Where SIGMOD
Authors Phillip B. Gibbons, Yossi Matias
Comments (0)