The paper presents a method for pruning frequent itemsets based on background knowledge represented by a Bayesian network. The interestingness of an itemset is defined as the absolute difference between its support estimated from the data and its support estimated from the Bayesian network. Efficient algorithms are presented for computing the interestingness of a collection of frequent itemsets and for finding all attribute sets with a given minimum interestingness. The practical usefulness and efficiency of the algorithms have been verified experimentally.

Categories and Subject Descriptors
H.2.8 [Database Management]: Database Applications-data mining

General Terms
Algorithms

Keywords
association rule, frequent itemset, background knowledge, interestingness, Bayesian network
Szymon Jaroszewicz, Dan A. Simovici
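
To make the interestingness definition in the abstract concrete, the following is a minimal sketch on a toy example. Everything in it is an illustrative assumption rather than material from the paper: the two-attribute network `A -> B`, its probability tables, the sample rows, and all function names are hypothetical. The network support is obtained by brute-force enumeration of the joint distribution, which is only feasible for tiny networks and is not the efficient algorithm the paper describes.

```python
from itertools import product

# Hypothetical toy Bayesian network over two binary attributes, A -> B.
# The probability values are made up for illustration only.
P_A = {0: 0.5, 1: 0.5}                   # P(A)
P_B_GIVEN_A = {0: {0: 0.9, 1: 0.1},      # P(B | A = 0)
               1: {0: 0.2, 1: 0.8}}      # P(B | A = 1)


def joint(a, b):
    """Joint probability P(A = a, B = b) under the toy network."""
    return P_A[a] * P_B_GIVEN_A[a][b]


def matches(assignment, itemset):
    """True if the assignment agrees with every attribute-value pair in the itemset."""
    return all(assignment[attr] == value for attr, value in itemset.items())


def support_from_network(itemset):
    """Support predicted by the network: sum the joint over consistent assignments.

    Brute-force enumeration; an assumption of this sketch, not the paper's method.
    """
    total = 0.0
    for a, b in product((0, 1), repeat=2):
        if matches({"A": a, "B": b}, itemset):
            total += joint(a, b)
    return total


def support_from_data(itemset, rows):
    """Support estimated from data: fraction of rows that contain the itemset."""
    return sum(matches(row, itemset) for row in rows) / len(rows)


def interestingness(itemset, rows):
    """Absolute difference between the data support and the network support."""
    return abs(support_from_data(itemset, rows) - support_from_network(itemset))


# Hypothetical data set of four records.
rows = [
    {"A": 1, "B": 1},
    {"A": 1, "B": 0},
    {"A": 0, "B": 0},
    {"A": 0, "B": 1},
]
itemset = {"A": 1, "B": 1}
score = interestingness(itemset, rows)
print(round(score, 2))
```

In this toy setting the itemset occurs in one of four records (support 0.25), while the network predicts 0.5 × 0.8 = 0.4, so the interestingness is about 0.15. An itemset whose observed support matches the network's prediction would score near zero and could be pruned as uninteresting with respect to the background knowledge.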