Sciweavers

VLDB
1998
ACM

Algorithms for Mining Association Rules for Binary Segmentations of Huge Categorical Databases

We consider the problem of finding association rules that make nearly optimal binary segmentations of huge categorical databases. The optimality of segmentation is defined by an objective function suitable for the user's objective. An objective function is usually defined in terms of the distribution of a given target attribute. Our goal is to find association rules that split databases into two subsets, optimizing the value of an objective function. The problem is intractable for general objective functions, because letting N be the number of records of a given database, there are 2^N possible binary segmentations, and we may have to exhaustively examine all of them. However, when the objective function is convex, there are feasible algorithms for finding nearly optimal binary segmentations, and we prove that typical criteria, such as "entropy (mutual information)," "χ² (correlation)," and "gini index (mean squared error)," are actually convex. We ...
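To make the three criteria named in the abstract concrete, the following is a minimal sketch of how each could be evaluated for one candidate binary segmentation. It is not the paper's algorithm: it assumes a binary target attribute, and the function names (`entropy`, `gini`, `split_scores`) and the count parameters `n`, `m`, `x`, `y` are hypothetical choices made for illustration only.

```python
import math


def entropy(p):
    """Binary entropy (in bits) of a target that is positive with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def gini(p):
    """Gini index of a binary target that is positive with probability p."""
    return 2 * p * (1 - p)


def split_scores(n, m, x, y):
    """Score one binary segmentation (illustrative, assumed interface).

    n: total number of records, m: records with a positive target,
    x: records in the first segment, y: positive records in that segment.
    """
    segments = [(x, y), (n - x, m - y)]

    def weighted(impurity):
        # Size-weighted impurity of the two segments; empty segments are skipped.
        return sum(size / n * impurity(pos / size)
                   for size, pos in segments if size > 0)

    # Chi-squared statistic of the 2x2 table (segment vs. target value).
    chi2 = 0.0
    for size, pos in segments:
        for observed, column_total in ((pos, m), (size - pos, n - m)):
            expected = size * column_total / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected

    return {
        "entropy_gain": entropy(m / n) - weighted(entropy),
        "gini_gain": gini(m / n) - weighted(gini),
        "chi2": chi2,
    }


if __name__ == "__main__":
    # 100 records, 50 positive; the first segment holds 50 records, 40 positive.
    print(split_scores(100, 50, 50, 40))
```

Each score depends only on the pair of counts (x, y), so comparing candidate segmentations reduces to comparing points in that two-dimensional count space. Enumerating all 2^N subsets this way is infeasible, which is why the abstract turns to convexity of the objective.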
Added 06 Aug 2010
Updated 06 Aug 2010
Type Conference
Year 1998
Where VLDB
Authors Yasuhiko Morimoto, Takeshi Fukuda, Hirofumi Matsuzawa, Takeshi Tokuyama, Kunikazu Yoda