ICDM
2005
IEEE

An Algorithm for In-Core Frequent Itemset Mining on Streaming Data

Frequent itemset mining is a core data mining operation and has been extensively studied over the last decade. This paper takes a new approach to this problem and makes two major contributions. First, we present a one-pass algorithm for frequent itemset mining that has deterministic bounds on its accuracy and does not require any out-of-core summary structure. Second, because our one-pass algorithm does not produce any false negatives, it can be easily extended to an accurate two-pass algorithm. Our two-pass algorithm is very memory efficient and allows mining of datasets with a large number of distinct items and/or very low support levels. Our detailed experimental evaluation on synthetic and real datasets shows the following. First, our one-pass algorithm is very accurate in practice. Second, our algorithm requires significantly less memory than Manku and Motwani’s one-pass algorithm and the multi-pass Apriori algorithm. Our two-pass algorithm outperforms Apriori and FP-tree ...
Ruoming Jin, Gagan Agrawal
Type Conference
Year 2005
Where ICDM
Authors Ruoming Jin, Gagan Agrawal
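
The abstract's central argument is that a one-pass result with no false negatives can be made exact by a second pass that only counts the surviving candidates. The sketch below illustrates that idea for single items rather than itemsets, using the well-known Misra-Gries counter-based summary as a stand-in. It is not the algorithm from the paper, whose details are not given in this record. The function names `one_pass_candidates` and `two_pass_frequent`, the parameter `theta`, and the sample data are all hypothetical.

```python
import math


def one_pass_candidates(stream, theta):
    """Single pass with bounded memory (Misra-Gries style, items only).

    Assumption: this is a stand-in technique, not the paper's method.
    Returns a candidate set that contains every item whose true count
    exceeds theta * N (no false negatives); it may contain false positives.
    """
    capacity = math.ceil(1 / theta)  # maximum number of counters kept in memory
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < capacity:
            counters[item] = 1
        else:
            # Summary is full: decrement every counter and drop those at zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return set(counters)


def two_pass_frequent(data, theta):
    """Second pass: count only the candidates exactly, then filter.

    `data` must be re-iterable (for example a list), since it is read twice.
    """
    candidates = one_pass_candidates(data, theta)
    exact = {item: 0 for item in candidates}
    n = 0
    for item in data:
        n += 1
        if item in exact:
            exact[item] += 1
    return {item: count for item, count in exact.items() if count > theta * n}


# Hypothetical sample data: "a" occurs 5 times out of 10.
data = ["a", "b", "a", "c", "a", "d", "a", "b", "a", "e"]
candidates = one_pass_candidates(data, 0.3)
frequent = two_pass_frequent(data, 0.3)
```

In this sketch the first pass keeps at most `ceil(1 / theta)` counters, so each counter undercounts by less than `theta * N` and no item above the threshold can be dropped. The second pass needs memory only for the candidates, which is the property the abstract relies on when it describes the two-pass variant as memory efficient.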