Sciweavers

STACS
2007
Springer

Small Space Representations for Metric Min-Sum k -Clustering and Their Applications

The min-sum $k$-clustering problem is to partition a metric space $(P, d)$ into $k$ clusters $C_1, \dots, C_k \subseteq P$ such that $\sum_{i=1}^{k} \sum_{p,q \in C_i} d(p, q)$ is minimized. We show the first efficient construction of a coreset for this problem. Our coreset construction is based on a new adaptive sampling algorithm. With our construction of coresets we obtain three main algorithmic results. The first result is a sublinear time $(4 + \epsilon)$-approximation algorithm for the min-sum $k$-clustering problem in metric spaces. The running time of this algorithm is $O(n)$ for any constant $k$ and $\epsilon$, and it is $o(n^2)$ for all $k = o(\log n / \log \log n)$. Since the full description size of the input is $\Theta(n^2)$, this is sublinear in the input size. The fastest previously known $o(\log n)$-factor approximation algorithm for $k > 2$ achieved a running time of $\Omega(n^k)$, and no non-trivial $o(n^2)$-time algorithm was known before. Our second result is the first pass-efficient data streaming algorithm for min-sum $k$-clustering in the...
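To make the objective in the abstract concrete, the following is a minimal Python sketch that evaluates the min-sum cost of a given partition and finds an optimal partition by exhaustive search on a tiny input. It is not the coreset construction or the approximation algorithm from the paper. The function names `min_sum_cost` and `brute_force_min_sum` are hypothetical, and the sketch assumes each unordered pair $\{p, q\}$ is counted once; under an ordered-pair reading of the sum, every cost simply doubles.

```python
from collections import defaultdict
from itertools import combinations, product


def min_sum_cost(points, labels, dist):
    """Sum of pairwise distances inside each cluster.

    Assumption: each unordered pair {p, q} is counted once.
    """
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    return sum(
        dist(p, q)
        for members in clusters.values()
        for p, q in combinations(members, 2)
    )


def brute_force_min_sum(points, k, dist):
    """Try all k^n labelings; only feasible for very small inputs.

    Returns (best_cost, best_labels). This exhaustive search is for
    illustration only and is unrelated to the paper's algorithms.
    """
    candidates = product(range(k), repeat=len(points))
    best_labels = min(candidates, key=lambda ls: min_sum_cost(points, ls, dist))
    return min_sum_cost(points, best_labels, dist), best_labels


if __name__ == "__main__":
    pts = [0, 1, 10, 11]                # points on a line
    d = lambda a, b: abs(a - b)         # a metric on the reals
    print(min_sum_cost(pts, [0, 0, 1, 1], d))
    print(brute_force_min_sum(pts, 2, d))
```

The exhaustive search takes time exponential in $n$, which is exactly why sublinear-time approximation via a small coreset is of interest.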
Artur Czumaj, Christian Sohler
Added 09 Jun 2010
Updated 09 Jun 2010
Type Conference
Year 2007
Where STACS
Authors Artur Czumaj, Christian Sohler