FIMI
2003

Probabilistic Iterative Expansion of Candidates in Mining Frequent Itemsets

A simple new algorithm is suggested for frequent itemset mining, using item probabilities as the basis for generating candidates. The method first finds all the frequent items, and then generates an estimate of the frequent sets, assuming item independence. The candidates are stored in a trie where each path from the root to a node represents one candidate itemset. The method expands the trie iteratively, until all frequent itemsets are found. Expansion is based on scanning through the data set in each iteration cycle, and extending the subtries based on observed node frequencies. Trie probing can be restricted to only those nodes which possibly need extension. The number of candidates is usually quite moderate; for dense datasets 2-4 times the number of final frequent itemsets, for non-dense sets somewhat more. In practical experiments the method has been observed to make clearly fewer passes than the well-known Apriori method. As for speed, our non-optimised implementation is in som...
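
The first two steps in the abstract (find the frequent items, then estimate the frequent sets under item independence and store them in a trie) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the nested-dict trie, the descending-probability item order, and the use of an absolute count as `min_support` are all assumptions made for illustration. The later iterative expansion from observed node frequencies is not shown.

```python
from collections import Counter


def item_probabilities(transactions, min_support):
    """Return {item: relative frequency} for items whose count reaches min_support."""
    n = len(transactions)
    # set(t) ensures an item is counted at most once per transaction.
    counts = Counter(item for t in transactions for item in set(t))
    return {i: c / n for i, c in counts.items() if c >= min_support}


def build_candidate_trie(probs, n, min_support):
    """Build a trie (nested dicts) of itemsets whose expected support under
    independence, n * product of item probabilities, reaches min_support.
    Each path from the root to a node is one candidate itemset."""
    # Assumed ordering: most probable items first, ties broken by item name.
    items = sorted(probs, key=lambda i: (-probs[i], i))

    def expand(start, prob):
        node = {}
        for k in range(start, len(items)):
            p = prob * probs[items[k]]
            if n * p >= min_support:
                node[items[k]] = expand(k + 1, p)
        return node

    return expand(0, 1.0)


def candidates(trie, prefix=()):
    """Yield every root-to-node path of the trie as a tuple of items."""
    for item, child in trie.items():
        path = prefix + (item,)
        yield path
        yield from candidates(child, path)
```

For the five transactions `{a,b,c}`, `{a,b}`, `{a,c}`, `{a}`, `{b,c}` with `min_support = 2`, the estimate admits `{a,b}` and `{a,c}` but rejects `{b,c}` (expected support 1.8), although `{b,c}` actually occurs twice. Misestimates of this kind are what the subsequent data-set scans and subtrie extensions described in the abstract have to correct.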
Attila Gyenesei, Jukka Teuhola
Added 31 Oct 2010
Updated 31 Oct 2010
Type Conference
Year 2003
Where FIMI
Authors Attila Gyenesei, Jukka Teuhola