Clustering for metric and nonmetric distance measures

We study a generalization of the k-median problem with respect to an arbitrary dissimilarity measure D. Given a finite set P of size n, our goal is to find a set C of size k such that the sum of errors $D(P, C) = \sum_{p \in P} \min_{c \in C} D(p, c)$ is minimized. The main result in this article can be stated as follows: There exists a $(1 + \epsilon)$-approximation algorithm for the k-median problem with respect to D, if the 1-median problem can be approximated within a factor of $(1 + \epsilon)$ by taking a random sample of constant size and solving the 1-median problem on the sample exactly. This algorithm requires time $n \cdot 2^{O(mk \log(mk/\epsilon))}$, where m is a constant that depends only on $\epsilon$ and D. Using this characterization, we obtain the first linear time $(1 + \epsilon)$-approximation algorithms for the k-median problem in an arbitrary metric space with bounded doubling dimension, for the Kullback-Leibler divergence (relative entropy), for the Itakura-Saito divergence, for Mahalanobis distances, and for some special cases of Bregman diver...
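The sum-of-errors objective and the sampling condition on the 1-median problem can be illustrated with a small sketch. This is not the authors' k-median algorithm; it only evaluates D(P, C) for the Kullback-Leibler divergence and compares the exact 1-median with one computed from a uniform random sample. All function names, the sample size of 50, and the synthetic input are assumptions made for illustration. The sketch relies on the known fact that, for a Bregman divergence such as KL, the center minimizing the sum of D(p, c) over c is the arithmetic mean of the points.

```python
import math
import random


def kl_divergence(p, c):
    """KL divergence D(p, c) = sum_i p_i * log(p_i / c_i) for strictly
    positive probability vectors p and c."""
    return sum(pi * math.log(pi / ci) for pi, ci in zip(p, c))


def clustering_cost(points, centers, dist):
    """Sum of errors D(P, C): every point pays the dissimilarity to its
    closest center."""
    return sum(min(dist(p, c) for c in centers) for p in points)


def centroid(points):
    """Coordinate-wise arithmetic mean; the exact 1-median under KL."""
    n = len(points)
    return [sum(coords) / n for coords in zip(*points)]


def sampled_one_median(points, sample_size, rng):
    """Draw a uniform sample (with replacement) and solve the 1-median
    problem exactly on the sample only."""
    sample = [rng.choice(points) for _ in range(sample_size)]
    return centroid(sample)


def random_distribution(rng, dim):
    """Hypothetical test input: a strictly positive probability vector."""
    weights = [rng.uniform(0.1, 1.0) for _ in range(dim)]
    total = sum(weights)
    return [w / total for w in weights]


rng = random.Random(0)
P = [random_distribution(rng, 4) for _ in range(2000)]

optimal_cost = clustering_cost(P, [centroid(P)], kl_divergence)
sampled_center = sampled_one_median(P, 50, rng)
sampled_cost = clustering_cost(P, [sampled_center], kl_divergence)

# The ratio plays the role of the (1 + epsilon) factor for k = 1.
ratio = sampled_cost / optimal_cost
print(f"approximation ratio of the sampled 1-median: {ratio:.4f}")
```

The ratio can never fall below 1, because the centroid of the full set is optimal. How close it gets to 1 depends on the sample size, which is the quantity the constant m in the abstract refers to.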

Marcel R. Ackermann, Johannes Blömer, Christian Sohler

1-median Problem | Algorithms | K-median Problem | Natural Language Processing | TALG 2010 |

Post Info

Added	21 May 2011
Updated	21 May 2011
Type	Journal
Year	2010
Where	TALG
Authors	Marcel R. Ackermann, Johannes Blömer, Christian Sohler
