Sampling-based estimators for subset-based queries



We consider the problem of using sampling to estimate the result of an aggregation operation over a subset-based SQL query, where a subquery is correlated to an outer query by a NOT EXISTS, NOT IN, EXISTS or IN clause. We design an unbiased estimator for our query and prove that it is indeed unbiased. We then provide a second, biased estimator that makes use of the superpopulation concept from statistics to minimize the mean squared error of the resulting estimate. The two estimators are tested over an extensive set of experiments.

Keywords: Sampling, Approximate Query Processing, Aggregate Query Processing
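To illustrate why subset-based queries are hard to estimate from samples, here is a minimal Python sketch. It is not either estimator from the paper; the toy relations, the function names, and the choice of a NOT EXISTS sum are all assumptions made for illustration. It evaluates the query against every possible sample of the inner relation and averages the results, showing that the naive plug-in answer is biased.

```python
from itertools import combinations

# Toy relations (hypothetical data, not from the paper).
# Query shape: SELECT SUM(o.value) FROM outer_rel o
#              WHERE NOT EXISTS (SELECT * FROM inner_rel i WHERE i.key = o.key)
OUTER = [(1, 10.0), (2, 20.0), (3, 30.0), (4, 40.0)]  # (key, value)
INNER = [1, 1, 2, 3]                                   # join keys only


def not_exists_sum(outer, inner_keys):
    """Sum values of outer rows that have no matching key in inner_keys."""
    present = set(inner_keys)
    return sum(value for key, value in outer if key not in present)


def expected_naive_estimate(outer, inner, sample_size):
    """Average the plug-in answer over every equally likely
    without-replacement sample of the inner relation."""
    samples = list(combinations(inner, sample_size))
    return sum(not_exists_sum(outer, s) for s in samples) / len(samples)


exact = not_exists_sum(OUTER, INNER)
expected = expected_naive_estimate(OUTER, INNER, sample_size=2)
print(f"exact answer:            {exact:.2f}")
print(f"expected naive estimate: {expected:.2f}")
```

On this toy data the exact answer is 40, while the naive estimate averages about 66.67. An outer row whose matches happen to be missing from the sample is wrongly counted as unmatched, so the plug-in answer for a NOT EXISTS sum over non-negative values can only err upward.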

Shantanu Joshi, Christopher M. Jermaine


Aggregate Query Processing | Approximate Query Processing | Database | Subset-based SQL Query | VLDB 2009


» New Sampling-Based Estimators for OLAP Queries

» New Sampling-Based Summary Statistics for Improving Approximate Query Answers

» A Latent Dirichlet Framework for Relevance Modeling

Post Info

Added	05 Dec 2009
Updated	05 Dec 2009
Type	Conference
Year	2009
Where	VLDB
Authors	Shantanu Joshi, Christopher M. Jermaine
