A novel approach to clustering co-occurrence data poses the task as an information-theoretic optimization problem that minimizes the resulting loss in mutual information. A divisive clustering algorithm that monotonically reduces this loss function was recently proposed. In this paper we show that sparse high-dimensional data presents special challenges that can cause the algorithm to get stuck at poor local minima. We propose two solutions to this problem: (a) a “prior” to overcome infinite relative entropy values, as in the supervised Naive Bayes algorithm, and (b) local search to escape local minima. Finally, we combine these solutions to obtain a robust, computationally efficient algorithm. We present experimental results showing that the proposed method is effective in clustering document collections and outperforms previous information-theoretic clustering approaches.
Inderjit S. Dhillon, Yuqiang Guan
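The role of the “prior” in (a) can be illustrated with a small sketch. The Python snippet below (the names kl_divergence and smooth, the pseudo-count alpha, and the toy count vectors are illustrative assumptions, not taken from the paper) shows how a zero cluster probability makes the relative entropy KL(p || q) infinite on sparse count data, and how adding a small pseudo-count in the spirit of Naive Bayes smoothing keeps it finite.

```python
import numpy as np

def kl_divergence(p, q):
    """KL(p || q) = sum_i p_i * log(p_i / q_i); infinite when q_i = 0 but p_i > 0."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0  # terms with p_i = 0 contribute 0 by convention
    with np.errstate(divide="ignore"):
        return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def smooth(counts, alpha):
    """Laplace-style prior: add pseudo-count alpha to every word before normalizing."""
    counts = np.asarray(counts, dtype=float)
    return (counts + alpha) / (counts.sum() + alpha * counts.size)

# Toy sparse word-count vectors for a document and a candidate cluster centroid.
doc_counts     = np.array([3, 0, 1, 0, 0])
cluster_counts = np.array([0, 5, 2, 0, 1])   # word 0 never occurs in the cluster

p = doc_counts / doc_counts.sum()
q = cluster_counts / cluster_counts.sum()
print(kl_divergence(p, q))                   # inf: q_0 = 0 while p_0 > 0

# With a small prior, all probabilities are positive and the divergence is finite.
p_s = smooth(doc_counts, alpha=0.01)
q_s = smooth(cluster_counts, alpha=0.01)
print(kl_divergence(p_s, q_s))               # finite value
```

Under this (assumed) setup, an unsmoothed assignment step that compares a document to cluster centroids by relative entropy can return infinite values for many candidate clusters, which is one way sparse data can trap a divisive algorithm at a poor local minimum; the pseudo-count prior removes those infinities without materially changing the distributions.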