Discovering Frequent Itemsets in the Presence of Highly Frequent Items

This paper presents new techniques for focusing the discovery of frequent itemsets within large, dense datasets containing highly frequent items. The existence of highly frequent items adds significantly to the cost of computing the complete set of frequent itemsets. Our approach allows for the exclusion of such items during the candidate generation phase of the Apriori algorithm. Afterwards, the highly frequent items can be reintroduced via an inferencing framework, which makes it possible to generate frequent itemsets without counting their frequency. We demonstrate the use of these new techniques within the well-studied framework of the Apriori algorithm. Furthermore, we provide empirical results using our techniques on both synthetic and real datasets. Both are relevant, since the real datasets exhibit statistical characteristics different from the probabilistic assumptions behind the synthetic data. The source we used for real data was the U.S. Census.
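To make the two phases concrete, the following is a minimal sketch, not the authors' implementation. It assumes a plain level-wise Apriori that skips a caller-supplied set of highly frequent items, and it stands in for the paper's inferencing framework (which the abstract does not specify) with a standard lower bound: for N transactions, supp(X ∪ {h}) >= supp(X) + supp(h) - N. The function names, the `excluded` parameter, and the toy transactions are all hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_count, excluded=frozenset()):
    """Level-wise Apriori that ignores items in `excluded` (assumed parameter).

    Returns a dict mapping each frequent itemset (frozenset) to its count.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items, skipping the excluded (highly frequent) ones.
    counts = {}
    for t in transactions:
        for item in t:
            if item not in excluded:
                key = frozenset([item])
                counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_count}
    frequent = dict(current)

    k = 2
    while current:
        # Candidate generation: join frequent (k-1)-itemsets, then prune any
        # candidate that has an infrequent (k-1)-subset.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Counting pass over the data for the surviving candidates.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        current = {s: n for s, n in counts.items() if n >= min_count}
        frequent.update(current)
        k += 1
    return frequent


def infer_with_item(frequent, item, item_count, n_transactions, min_count):
    """Reintroduce one excluded item without another counting pass.

    Assumption: uses the bound supp(X | {item}) >= supp(X) + supp(item) - N,
    which is a stand-in for the paper's (unspecified) inferencing framework.
    Returns itemsets guaranteed frequent, mapped to their lower-bound counts.
    """
    inferred = {}
    for itemset, count in frequent.items():
        lower_bound = count + item_count - n_transactions
        if lower_bound >= min_count:
            inferred[itemset | {item}] = lower_bound
    return inferred


# Toy data (hypothetical): item "h" appears in 5 of 6 transactions.
transactions = [
    {"h", "a", "b"}, {"h", "a", "b"}, {"h", "a", "b"},
    {"h", "a"}, {"h", "c"}, {"a", "b"},
]
frequent = apriori(transactions, min_count=2, excluded=frozenset({"h"}))
inferred = infer_with_item(frequent, "h", item_count=5,
                           n_transactions=len(transactions), min_count=2)
```

The bound is conservative: an itemset that fails it may still be frequent and would need an actual count, so this sketch only shows how some itemsets containing a highly frequent item can be accepted without being counted.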

Dennis P. Groth, Edward L. Robertson

Discovery of Frequent Itemsets | Frequent Items | Frequent Itemsets | INAP 2001 | Information Management

Added	30 Jul 2010
Updated	30 Jul 2010
Type	Conference
Year	2001
Where	INAP
Authors	Dennis P. Groth, Edward L. Robertson
