Mining Compressed Frequent-Pattern Sets

A major challenge in frequent-pattern mining is the sheer size of its mining results. In many cases, a high min_sup threshold may discover only commonsense patterns, but a low one may generate an explosive number of output patterns, which severely restricts its usage. In this paper, we study the problem of compressing frequent-pattern sets. Typically, frequent patterns can be clustered with a tightness measure δ (called δ-cluster), and a representative pattern can be selected for each cluster. Unfortunately, finding a minimum set of representative patterns is NP-Hard. We develop two greedy methods, RPglobal and RPlocal. The former has a guaranteed compression bound but higher computational complexity. The latter sacrifices the theoretical bounds but is far more efficient. Our performance study shows that the compression quality using RPlocal is very close to RPglobal, and both can reduce the number of closed frequent patterns by almost two orders of magnitude. Furthermore, RPloca...
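To make the idea of δ-clusters and greedy representative selection concrete, here is a minimal Python sketch. It is not the paper's RPglobal or RPlocal. The abstract does not define the tightness measure, so the sketch assumes a Jaccard distance between the sets of transactions supporting two patterns, and it assumes a representative may cover only its own sub-patterns within distance δ. The selection step is a plain greedy set cover, and all function names are hypothetical.

```python
def support_set(pattern, transactions):
    """Ids of the transactions that contain every item of `pattern`."""
    return frozenset(i for i, t in enumerate(transactions) if pattern <= t)


def distance(a, b, transactions):
    """Jaccard distance between supporting transaction sets.

    Assumption: the abstract does not spell out the tightness measure.
    """
    ta = support_set(a, transactions)
    tb = support_set(b, transactions)
    union = ta | tb
    if not union:
        return 0.0
    return 1.0 - len(ta & tb) / len(union)


def greedy_representatives(patterns, transactions, delta):
    """Greedy set cover: repeatedly pick the pattern covering the most
    still-uncovered patterns. Every pattern covers at least itself, so
    the loop always terminates."""
    # Assumption: r may represent p only if p is a sub-pattern of r
    # and the two are within distance delta.
    covers = {
        r: {p for p in patterns
            if p <= r and distance(p, r, transactions) <= delta}
        for r in patterns
    }
    uncovered = set(patterns)
    chosen = []
    while uncovered:
        best = max(patterns, key=lambda r: len(covers[r] & uncovered))
        chosen.append(best)
        uncovered -= covers[best]
    return chosen


if __name__ == "__main__":
    transactions = [frozenset("abc")] * 3 + [frozenset("ab"), frozenset("d")]
    patterns = [frozenset(p) for p in ("a", "b", "ab", "abc", "d")]
    for rep in greedy_representatives(patterns, transactions, delta=0.3):
        print(sorted(rep))
```

In this toy input, a larger δ lets one long pattern stand in for several of its sub-patterns, while a smaller δ forces more representatives. This is the trade-off between compression and tightness that the abstract describes.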

Dong Xin, Jiawei Han, Xifeng Yan, Hong Cheng

Database | Frequent Patterns | Min Sup Threshold | Representative Patterns | VLDB 2005 |

Post Info

Added	28 Jun 2010
Updated	28 Jun 2010
Type	Conference
Year	2005
Where	VLDB
Authors	Dong Xin, Jiawei Han, Xifeng Yan, Hong Cheng
