SIGMOD
2009
ACM

Estimating the confidence of conditional functional dependencies

Conditional functional dependencies (CFDs) have recently been proposed as extensions of classical functional dependencies that apply to a certain subset of the relation, as specified by a pattern tableau. Calculating the support and confidence of a CFD (i.e., the size of the applicable subset and the extent to which it satisfies the CFD) gives valuable information about data semantics and data quality. While computing the support is easier, computing the confidence exactly is expensive if the relation is large, and estimating it from a random sample of the relation is unreliable unless the sample is large. We study how to efficiently estimate the confidence of a CFD with a small number of passes (one or two) over the input using small space. Our solutions are based on a variety of sampling and sketching techniques, and apply both when the pattern tableau is known in advance and in the harder case when it is given only after the data have been seen. We analyze our algorithms, and show that ...
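To make the two measures concrete, here is a minimal Python sketch of the exact (non-streaming) computation for a single pattern row. It is not the paper's estimation algorithm. It assumes one common reading of confidence, namely the largest fraction of pattern-matching tuples that can be kept so that the dependency holds exactly, and it assumes the pattern constrains only left-hand-side attributes. The names `WILDCARD`, `matches`, and `support_and_confidence`, and the example relation, are hypothetical.

```python
from collections import Counter, defaultdict

WILDCARD = "_"  # hypothetical marker meaning "any value" in a pattern row


def matches(row, pattern):
    """Return True if the row agrees with every constant in the pattern."""
    return all(v == WILDCARD or row[a] == v for a, v in pattern.items())


def support_and_confidence(rows, lhs, rhs, pattern):
    """Exact support and confidence of the CFD (lhs -> rhs, pattern).

    Assumed definitions (not taken from the abstract itself):
      support    = number of rows matching the pattern
      confidence = fraction of those rows that survive if, within each
                   group of equal lhs values, only the most frequent
                   rhs value is kept
    """
    groups = defaultdict(Counter)  # lhs values -> counts of rhs values
    support = 0
    for row in rows:
        if not matches(row, pattern):
            continue
        support += 1
        key = tuple(row[a] for a in lhs)
        groups[key][row[rhs]] += 1
    if support == 0:
        return 0, None  # confidence is undefined on an empty subset
    kept = sum(max(counts.values()) for counts in groups.values())
    return support, kept / support


# Hypothetical relation: within the UK, (country, zip) should determine city.
rows = [
    {"country": "UK", "zip": "EH4", "city": "Edinburgh"},
    {"country": "UK", "zip": "EH4", "city": "Edinburgh"},
    {"country": "UK", "zip": "EH4", "city": "Glasgow"},
    {"country": "UK", "zip": "W1", "city": "London"},
    {"country": "US", "zip": "EH4", "city": "Boston"},
]
pattern = {"country": "UK", "zip": WILDCARD}
support, confidence = support_and_confidence(
    rows, ["country", "zip"], "city", pattern
)
```

This exact version stores a counter for every distinct left-hand-side value, so its memory use grows with the data. That cost is what motivates the small-space, one- or two-pass estimation the abstract describes.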
Added 05 Dec 2009
Updated 05 Dec 2009
Type Conference
Year 2009
Where SIGMOD
Authors Graham Cormode, Lukasz Golab, Flip Korn, Andrew McGregor, Divesh Srivastava, Xi Zhang