COLT
2008
Springer

Finding Metric Structure in Information Theoretic Clustering

We study the problem of clustering discrete probability distributions with respect to the Kullback-Leibler (KL) divergence. This problem arises naturally in many applications. Our goal is to pick k distributions as "representatives" such that the average or maximum KL-divergence between an input distribution and the closest representative distribution is minimized. Unfortunately, no polynomial-time algorithms with worst-case performance guarantees are known for either of these problems. The analogous problems for ℓ₁, ℓ₂ and ℓ₂² (i.e., k-center, k-median and k-means) have been extensively studied, and efficient algorithms with good approximation guarantees are known. However, these algorithms rely crucially on the (geo-)metric properties of these metrics and do not apply to KL-divergence. In this paper, our contribution is to find a "relaxed" metric structure for KL-divergence. In doing so, we provide the first polynomial-time algorithm for clustering using KL-diverge...
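To make the two objectives in the abstract concrete, here is a minimal Python sketch that evaluates them for a fixed set of representatives. It is an illustration only, not the algorithm from the paper: the function names `kl_divergence` and `clustering_costs` are hypothetical, the direction KL(input || representative) is an assumption (the abstract does not state which argument comes first), and natural logarithms are used. The sketch only scores a given choice of representatives; it does not search for good ones.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions of equal length.

    Assumed conventions: terms with p_i == 0 contribute 0, and the result
    is infinite if q_i == 0 while p_i > 0.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue
        if qi == 0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total


def clustering_costs(inputs, representatives):
    """Return (average, maximum) divergence to the closest representative.

    Assumption: each input p is charged KL(p || c) for its nearest
    representative c.
    """
    nearest = [
        min(kl_divergence(p, c) for c in representatives) for p in inputs
    ]
    return sum(nearest) / len(nearest), max(nearest)


if __name__ == "__main__":
    inputs = [[0.5, 0.5], [0.9, 0.1], [0.1, 0.9]]
    representatives = [[0.5, 0.5], [0.8, 0.2]]
    average_cost, maximum_cost = clustering_costs(inputs, representatives)
    print("average:", average_cost)
    print("maximum:", maximum_cost)
```

Note that `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ, which is one way to see why algorithms that depend on metric properties such as symmetry do not carry over directly.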
Kamalika Chaudhuri, Andrew McGregor
Added 18 Oct 2010
Updated 18 Oct 2010
Type Conference