VLDB
2004
ACM

CORDS: Automatic Generation of Correlation Statistics in DB2

When query optimizers erroneously assume that database columns are statistically independent, they can underestimate the selectivities of conjunctive predicates by orders of magnitude. Such underestimation often leads to drastically suboptimal query execution plans. We demonstrate CORDS, an efficient and scalable tool for automatic discovery of correlations and soft functional dependencies between column pairs. We apply CORDS to real, synthetic, and TPC-H benchmark data, and show that CORDS discovers correlations in an efficient and scalable manner. The output of CORDS can be visualized graphically, making CORDS a useful mining and analysis tool for database administrators. CORDS ranks the discovered correlated column pairs and recommends to the optimizer a set of statistics to collect for the “most important” of the pairs. Use of these statistics speeds up processing times by orders of magnitude for a wide range of queries.
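The underestimation described in the abstract can be illustrated with a small sketch. The toy table, the column names, and both helper functions below are hypothetical and chosen only for illustration; in particular, `fd_strength` is a simple stand-in measure and is not claimed to be the test that CORDS itself uses.

```python
from collections import Counter

# Hypothetical toy table of (make, model) rows in which model determines make.
rows = (
    [("Honda", "Accord")] * 10
    + [("Honda", "Civic")] * 10
    + [("Toyota", "Camry")] * 10
    + [("Toyota", "Corolla")] * 10
)

def selectivity(rows, predicate):
    """Fraction of rows that satisfy the predicate."""
    return sum(1 for r in rows if predicate(r)) / len(rows)

def fd_strength(rows, src, dst):
    """Illustrative measure (an assumption, not the CORDS test): the fraction
    of rows whose dst value is the most common one for their src value.
    A value of 1.0 means src functionally determines dst in this table."""
    by_src = {}
    for r in rows:
        by_src.setdefault(r[src], Counter())[r[dst]] += 1
    return sum(max(c.values()) for c in by_src.values()) / len(rows)

# Selectivity of each predicate taken on its own.
sel_make = selectivity(rows, lambda r: r[0] == "Honda")
sel_model = selectivity(rows, lambda r: r[1] == "Accord")

# Estimate under the independence assumption: multiply the two selectivities.
independent_estimate = sel_make * sel_model

# Actual selectivity of the conjunctive predicate.
true_selectivity = selectivity(
    rows, lambda r: r[0] == "Honda" and r[1] == "Accord"
)

print(independent_estimate, true_selectivity)
print(fd_strength(rows, 1, 0), fd_strength(rows, 0, 1))
```

In this toy table the independence estimate is half the true selectivity. The gap widens as the number of distinct values grows, which is consistent with the orders-of-magnitude errors the abstract mentions for real data.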
Added 02 Jul 2010
Updated 02 Jul 2010
Type Conference
Year 2004
Where VLDB
Authors Ihab F. Ilyas, Volker Markl, Peter J. Haas, Paul G. Brown, Ashraf Aboulnaga