Sciweavers

TALG
2010

Clustering for metric and nonmetric distance measures

We study a generalization of the k-median problem with respect to an arbitrary dissimilarity measure D. Given a finite set P of size n, our goal is to find a set C of size k such that the sum of errors $D(P, C) = \sum_{p \in P} \min_{c \in C} D(p, c)$ is minimized. The main result in this article can be stated as follows: There exists a $(1 + \epsilon)$-approximation algorithm for the k-median problem with respect to D, if the 1-median problem can be approximated within a factor of $(1 + \epsilon)$ by taking a random sample of constant size and solving the 1-median problem on the sample exactly. This algorithm requires time $n 2^{O(mk \log(mk/\epsilon))}$, where m is a constant that depends only on $\epsilon$ and D. Using this characterization, we obtain the first linear time $(1 + \epsilon)$-approximation algorithms for the k-median problem in an arbitrary metric space with bounded doubling dimension, for the Kullback-Leibler divergence (relative entropy), for the Itakura-Saito divergence, for Mahalanobis distances, and for some special cases of Bregman diver...
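The sum-of-errors objective and the sampling condition for the 1-median problem can be illustrated with a minimal sketch. It is not the algorithm from the article: it only evaluates D(P, C) for the Kullback-Leibler divergence and compares the exact 1-median with the one computed from a small uniform sample. It relies on the standard fact that, for a Bregman divergence with the center as second argument, the centroid is the exact 1-median. All function names, the sample size of 50, and the synthetic input data are assumptions made for illustration.

```python
import math
import random


def kl_divergence(p, c):
    # D(p, c) = sum_i p_i * log(p_i / c_i); assumes strictly positive entries.
    return sum(pi * math.log(pi / ci) for pi, ci in zip(p, c))


def sum_of_errors(points, centers, dist):
    # D(P, C): every point pays the dissimilarity to its closest center.
    return sum(min(dist(p, c) for c in centers) for p in points)


def centroid(points):
    # Coordinate-wise mean; the exact 1-median for KL with the center
    # as second argument (a property of Bregman divergences).
    n = len(points)
    return [sum(coords) / n for coords in zip(*points)]


def sampled_one_median(points, sample_size, rng):
    # Draw a uniform sample with replacement and solve 1-median on it exactly.
    sample = [rng.choice(points) for _ in range(sample_size)]
    return centroid(sample)


def random_distribution(dim, rng):
    # Hypothetical test data: a probability vector bounded away from zero.
    weights = [rng.uniform(0.5, 1.5) for _ in range(dim)]
    total = sum(weights)
    return [w / total for w in weights]


rng = random.Random(0)
P = [random_distribution(4, rng) for _ in range(2000)]

exact = sum_of_errors(P, [centroid(P)], kl_divergence)
approx = sum_of_errors(P, [sampled_one_median(P, 50, rng)], kl_divergence)
print("cost ratio (sampled / exact):", approx / exact)
```

The sampled center is computed from 50 points regardless of n, which is the sense in which a constant-size sample suffices for this particular input. The sketch makes no claim about the sample size needed for a given $\epsilon$.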
Added 21 May 2011
Updated 21 May 2011
Type Journal
Year 2010
Where TALG
Authors Marcel R. Ackermann, Johannes Blömer, Christian Sohler