KDD
2004
ACM

Dense itemsets

Frequent itemset mining has been studied extensively in data mining research ever since association rules were introduced. In this paper we address a problem with frequent itemsets: they count only rows in which all of their attributes are present, and so do not allow for any noise. We show that generalizing the concept of frequency while preserving the performance of mining algorithms is nontrivial, and introduce a generalization of frequent itemsets, dense itemsets. Dense itemsets do not require all attributes to be present at the same time; instead, the itemset needs to define a sufficiently large submatrix that exceeds a given density threshold of attributes present. We consider the problem of computing all dense itemsets in a database. We give a levelwise algorithm for this problem, and also study the top-k variations, i.e., finding the k densest sets with a given support, or the k best-supported sets with a given density. These algorithms select the other parameter automatic...
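
To make the contrast between frequency and density concrete, here is a minimal Python sketch of one plausible reading of the definition in the abstract. It is not the paper's algorithm: the function names (`frequency`, `best_density`, `is_dense`), the representation of rows as Python sets, and the parameters `min_rows` and `min_density` are all assumptions made for illustration. For a fixed itemset, the densest submatrix with at least `min_rows` rows is obtained by keeping the rows that contain the most items of the set, so sorting the per-row counts is enough.

```python
def frequency(rows, itemset):
    """Classical support count: rows that contain every item of the itemset."""
    return sum(1 for row in rows if itemset <= row)


def best_density(rows, itemset, min_rows):
    """Highest fraction of ones in any submatrix whose columns are `itemset`
    and which has at least `min_rows` rows (illustrative definition).

    The mean of the top-k per-row counts never increases as k grows, so the
    maximum is reached by taking exactly the `min_rows` best rows.
    """
    if not itemset:
        raise ValueError("itemset must be non-empty")
    if not 0 < min_rows <= len(rows):
        raise ValueError("min_rows must be between 1 and the number of rows")
    # Number of itemset attributes present in each row, best rows first.
    counts = sorted((len(itemset & row) for row in rows), reverse=True)
    return sum(counts[:min_rows]) / (min_rows * len(itemset))


def is_dense(rows, itemset, min_rows, min_density):
    """True if some submatrix with >= min_rows rows reaches min_density."""
    return best_density(rows, itemset, min_rows) >= min_density


# Hypothetical toy database: each row is the set of attributes present.
rows = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"d"}]
itemset = {"a", "b", "c"}

print(frequency(rows, itemset))         # only one row contains all three items
print(best_density(rows, itemset, 4))   # 9 ones in a 4 x 3 submatrix
print(is_dense(rows, itemset, 4, 0.7))
```

In this toy data the itemset {a, b, c} occurs in full in a single row, so it would fail any frequency threshold above one row, yet four of the five rows each contain at least two of its three items. That is the kind of noisy pattern the abstract says frequent itemsets miss.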
Heikki Mannila, Jouni K. Seppänen
Added 30 Nov 2009
Updated 30 Nov 2009
Type Conference
Year 2004
Where KDD