A well-known problem that limits the practical usage of association rule mining algorithms is the extremely large number of rules generated. Such a large number of rules makes the algorithms inefficient and makes it difficult for the end users to comprehend the discovered rules. We present the concept of a heavy itemset. An itemset A is heavy (for given support and confidence values) if all possible association rules made up of items only in A are present. We prove a simple necessary and sufficient condition for an itemset to be heavy. We present a formula for the number of possible rules for a given heavy itemset, and show that a heavy itemset compactly represents an exponential number of association rules. We present an efficient greedy algorithm to generate a collection of disjoint heavy itemsets in a given transaction database. We then present a modified apriori algorithm that uses heavy items and detects more heavy itemsets, not necessarily disjoint with the given ones.
Girish Keshav Palshikar, Mandar S. Kale, Manoj M.