Sciweavers

VLDB
2009
ACM

Guessing the extreme values in a data set: a Bayesian method and its applications

For a large number of data management problems, it would be very useful to be able to obtain a few samples from a data set, and to use the samples to guess the largest (or smallest) value in the entire data set. Min/max online aggregation, top-k query processing, outlier detection, and distance join are just a few possible applications. This paper details a statistically rigorous, Bayesian approach to attacking this problem. Just as importantly, we demonstrate the utility of our approach by showing how it can be applied to four specific problems that arise in the context of data management.
Keywords Sampling · Online aggregation · Monte Carlo · Extreme values · Bayesian
Mingxi Wu, Chris Jermaine
Added 05 Dec 2009
Updated 05 Dec 2009
Type Conference
Year 2009
Where VLDB
Authors Mingxi Wu, Chris Jermaine