KDD
2007
ACM

Mining statistically important equivalence classes and delta-discriminative emerging patterns

The support-confidence framework is the most common measure used in itemset mining algorithms because its antimonotonicity effectively simplifies the search lattice. This computational convenience introduces both quality and statistical flaws into the results, as many previous studies have observed. In this paper, we introduce a novel algorithm that produces itemsets ranked by statistical merit under sophisticated test statistics such as chi-square, risk ratio, and odds ratio. Our algorithm is based on the concept of equivalence classes. An equivalence class is a set of frequent itemsets that always occur together in the same set of transactions. Therefore, itemsets within an equivalence class all share the same level of statistical significance, regardless of which test statistic is used. As an equivalence class can be uniquely determined and concisely represented by a closed pattern and a set of generators, we just mine closed patterns and generators, taking a simultaneous depth-...
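To make the notion of an equivalence class concrete, the following is a minimal brute-force sketch, not the algorithm from the paper. It enumerates every itemset, groups itemsets by the exact set of transactions that contain them, and reports each group's closed pattern (its unique maximal member) and generators (its minimal members). The function name `equivalence_classes`, the `min_support` parameter, and the toy data are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def equivalence_classes(transactions, min_support=1):
    """Map each transaction-id set to (closed pattern, generators).

    Illustrative only: enumerates all itemsets, so it is exponential
    in the number of distinct items.
    """
    items = sorted(set().union(*transactions))
    classes = defaultdict(list)

    # Group every frequent itemset by the set of transactions containing it.
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            tids = frozenset(
                i for i, t in enumerate(transactions) if itemset <= t
            )
            if len(tids) >= min_support:
                classes[tids].append(itemset)

    result = {}
    for tids, members in classes.items():
        # The union of all members occurs in exactly the same transactions,
        # so it is the unique maximal member: the closed pattern.
        closed = frozenset().union(*members)
        # Generators are members with no proper subset in the same class.
        generators = [
            m for m in members if not any(other < m for other in members)
        ]
        result[tids] = (closed, generators)
    return result


if __name__ == "__main__":
    # Hypothetical toy data: four transactions over items a, b, c.
    data = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "c"}, {"c"}]
    for tids, (closed, gens) in equivalence_classes(data, min_support=2).items():
        print(sorted(tids), sorted(closed), [sorted(g) for g in gens])
```

Because every itemset in a class occurs in exactly the same transactions, any statistic computed from those occurrence counts takes the same value for all members of the class, which is why reporting only the closed pattern and the generators loses no information.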
Jinyan Li, Guimei Liu, Limsoon Wong
Added 08 Jun 2010
Updated 08 Jun 2010
Type Conference
Year 2007
Where KDD