
PAKDD
2005
ACM

Approximated Clustering of Distributed High-Dimensional Data

In many modern application domains, high-dimensional feature vectors are used to model complex real-world objects. Often these objects reside on different local sites. In this paper, we present a general approach for extracting knowledge from distributed data sets without transmitting all data from the local clients to a server site. To keep the transmission cost low, we first determine suitable local feature vector approximations, which are sent to the server. We approximate each feature vector as precisely as possible with a specified number of bytes. To extract knowledge from these approximations, we introduce a suitable distance function between the feature vector approximations. In a detailed experimental evaluation, we demonstrate the benefits of our new feature vector approximation technique for the important area of distributed clustering. We show that the combination of standard clustering algorithms and our feature vector approximation tec...
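The abstract does not spell out the approximation scheme or the distance function, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a plain uniform grid quantization over a known value range: the byte budget is split evenly across dimensions, each coordinate is replaced by a grid cell index, and the distance between two approximations is taken as the Euclidean distance between cell centers. The names `approximate`, `reconstruct`, `approx_distance`, and the `lo`/`hi` range parameters are all hypothetical.

```python
import math


def approximate(vec, num_bytes, lo=0.0, hi=1.0):
    """Quantize each coordinate to a grid cell index within a byte budget.

    Assumption (not from the paper): bits are split evenly across
    dimensions and values lie in the known range [lo, hi].
    """
    bits = (num_bytes * 8) // len(vec)
    if bits < 1:
        raise ValueError("byte budget too small for one bit per dimension")
    levels = 1 << bits
    cells = []
    for x in vec:
        # Clamp to the range, scale to [0, 1], then pick the grid cell.
        t = (min(max(x, lo), hi) - lo) / (hi - lo)
        cells.append(min(int(t * levels), levels - 1))
    return cells, bits


def reconstruct(cells, bits, lo=0.0, hi=1.0):
    """Map cell indices back to the centers of their grid cells."""
    width = (hi - lo) / (1 << bits)
    return [lo + (c + 0.5) * width for c in cells]


def approx_distance(cells_a, cells_b, bits, lo=0.0, hi=1.0):
    """Euclidean distance between the cell centers of two approximations."""
    pa = reconstruct(cells_a, bits, lo, hi)
    pb = reconstruct(cells_b, bits, lo, hi)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(pa, pb)))


if __name__ == "__main__":
    # A client would send only `cells` (2 bytes here) instead of 4 floats.
    a, bits = approximate([0.1, 0.9, 0.5, 0.3], num_bytes=2)
    b, _ = approximate([0.2, 0.8, 0.5, 0.4], num_bytes=2)
    print(a, b, approx_distance(a, b, bits))
```

Under these assumptions, a server could feed `approx_distance` into any standard clustering algorithm that accepts a custom distance function; each coordinate's reconstruction error is bounded by half a cell width, so a larger byte budget gives a tighter approximation.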
Hans-Peter Kriegel, Peter Kunath, Martin Pfeifle, Matthias Renz
Added 28 Jun 2010
Updated 28 Jun 2010
Type Conference
Year 2005
Where PAKDD
Authors Hans-Peter Kriegel, Peter Kunath, Martin Pfeifle, Matthias Renz