Maximally informative k-itemsets and their efficient discovery


In this paper we present a new approach to mining binary data. We treat each binary feature (item) as a means of distinguishing two sets of examples. Our interest is in selecting, from the total set of items, an itemset of specified size such that the database is partitioned with as uniform a distribution over the parts as possible. To achieve this goal, we propose the use of joint entropy as a quality measure for itemsets, and refer to optimal itemsets of cardinality k as maximally informative k-itemsets. We claim that this approach both maximises distinctive power and minimises redundancy within the feature set. A number of algorithms are presented for computing optimal itemsets efficiently.

Categories and Subject Descriptors: F.2 Analysis of Algorithms and Problem Complexity. G.3
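To make the quality measure in the abstract concrete, the following is a minimal Python sketch of joint entropy over a binary database, together with a naive exhaustive search for the best itemset of size k. It is an illustration only and not one of the paper's algorithms: the function names `joint_entropy` and `best_k_itemset`, the row-of-tuples data layout, and the toy database are all assumptions made here, and brute force is practical only for small numbers of items.

```python
from collections import Counter
from itertools import combinations
from math import log2


def joint_entropy(rows, itemset):
    """Joint entropy (in bits) of the items in `itemset` over `rows`.

    Each row is a tuple of 0/1 values; `itemset` is a tuple of column indices.
    The items partition the rows by their combined values, and the entropy
    is highest when the parts are equally sized.
    """
    counts = Counter(tuple(row[i] for i in itemset) for row in rows)
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def best_k_itemset(rows, k):
    """Exhaustive search over all k-itemsets (hypothetical helper).

    Returns the first itemset with maximal joint entropy and that entropy.
    """
    n_items = len(rows[0])
    return max(
        ((s, joint_entropy(rows, s)) for s in combinations(range(n_items), k)),
        key=lambda pair: pair[1],
    )


# Toy database (assumed for illustration): 8 examples, 4 binary items.
# Item 3 is an exact copy of item 0, so it adds no information to it.
rows = [
    (0, 0, 0, 0),
    (0, 0, 1, 0),
    (0, 1, 0, 0),
    (0, 1, 1, 0),
    (1, 0, 0, 1),
    (1, 0, 1, 1),
    (1, 1, 0, 1),
    (1, 1, 1, 1),
]

itemset, entropy = best_k_itemset(rows, 2)
print(itemset, entropy)
print(joint_entropy(rows, (0, 3)))
```

In this toy data, items 0 and 1 split the eight examples into four equal parts, whereas the redundant pair of items 0 and 3 yields only two parts, which illustrates how a joint-entropy criterion rewards distinctive power and penalises redundancy at the same time.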

Arno J. Knobbe, Eric K. Y. Ho


Data Mining | Descriptors F.2 Analysis | KDD 2006 | Mining Binary Data | Optimal Itemsets |


Post Info

Added	30 Nov 2009
Updated	30 Nov 2009
Type	Conference
Year	2006
Where	KDD
Authors	Arno J. Knobbe, Eric K. Y. Ho
