Sciweavers

VLDB
2005
ACM

Mining Compressed Frequent-Pattern Sets

A major challenge in frequent-pattern mining is the sheer size of its mining results. In many cases, a high min_sup threshold may discover only commonsense patterns, but a low one may generate an explosive number of output patterns, which severely restricts its usage. In this paper, we study the problem of compressing frequent-pattern sets. Typically, frequent patterns can be clustered with a tightness measure δ (called δ-cluster), and a representative pattern can be selected for each cluster. Unfortunately, finding a minimum set of representative patterns is NP-Hard. We develop two greedy methods, RPglobal and RPlocal. The former has a guaranteed compression bound but higher computational complexity. The latter sacrifices the theoretical bound but is far more efficient. Our performance study shows that the compression quality of RPlocal is very close to that of RPglobal, and both can reduce the number of closed frequent patterns by almost two orders of magnitude. Furthermore, RPloca...
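The idea of picking one representative per δ-cluster can be illustrated with a small greedy set-cover sketch. This is not the paper's RPglobal or RPlocal; it only shows the general shape of a greedy selection under assumptions that the abstract does not state: the distance between two patterns is taken to be the Jaccard distance between their supporting transaction sets, and a representative is assumed to cover a pattern when it is a superset of that pattern and lies within distance δ of it. All function names are hypothetical, and the brute-force enumeration is only suitable for toy data.

```python
from itertools import combinations


def support_set(pattern, transactions):
    """Indices of the transactions that contain every item of `pattern`."""
    return frozenset(i for i, t in enumerate(transactions) if pattern <= t)


def frequent_patterns(transactions, min_sup):
    """Brute-force enumeration of frequent itemsets (toy data only)."""
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            pattern = frozenset(combo)
            supp = support_set(pattern, transactions)
            if len(supp) >= min_sup:
                result[pattern] = supp
    return result


def distance(s1, s2):
    """Assumed measure: Jaccard distance between two support sets."""
    return 1.0 - len(s1 & s2) / len(s1 | s2)


def covers(rep, pat, supports, delta):
    """Assumed rule: `rep` covers `pat` if pat is a subset of rep
    and their support sets are within distance delta."""
    return pat <= rep and distance(supports[rep], supports[pat]) <= delta


def greedy_representatives(supports, delta):
    """Greedy set cover: repeatedly pick the pattern that covers
    the most still-uncovered patterns."""
    cover_sets = {
        r: {p for p in supports if covers(r, p, supports, delta)}
        for r in supports
    }
    uncovered = set(supports)
    reps = []
    while uncovered:
        # Ties are broken by pattern length, then lexicographically,
        # so the result is deterministic.
        best = max(
            cover_sets,
            key=lambda r: (len(cover_sets[r] & uncovered), len(r), sorted(r)),
        )
        reps.append(best)
        uncovered -= cover_sets[best]
    return reps
```

Every pattern covers itself, so the loop always terminates. A larger δ lets one long pattern stand in for more of its sub-patterns, which is where the compression comes from; with δ = 0, only patterns with identical support sets are merged.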
Dong Xin, Jiawei Han, Xifeng Yan, Hong Cheng
Added 28 Jun 2010
Updated 28 Jun 2010
Type Conference
Year 2005
Where VLDB