Approximated Clustering of Distributed High-Dimensional Data

Download www.dbs.ifi.lmu.de

In many modern application domains, high-dimensional feature vectors are used to model complex real-world objects. Often these objects reside on different local sites. In this paper, we present a general approach for extracting knowledge from distributed data sets without transmitting all data from the local clients to a server site. To keep the transmission cost low, we first determine suitable local feature vector approximations, which are sent to the server. We approximate each feature vector as precisely as possible with a specified number of bytes. To extract knowledge from these approximations, we introduce a suitable distance function between the feature vector approximations. In a detailed experimental evaluation, we demonstrate the benefits of our new feature vector approximation technique for the important area of distributed clustering. We show that the combination of standard clustering algorithms and our feature vector approximation tec...
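The idea of approximating a feature vector within a fixed byte budget and comparing approximations on the server can be illustrated with a minimal sketch. This is not the authors' technique, which the abstract does not spell out; it swaps in plain uniform grid quantization as a stand-in. The function names `approximate`, `reconstruct`, and `approx_distance`, the shared value range `lo`/`hi`, and the equal split of bits across dimensions are all assumptions made for illustration.

```python
import math

def approximate(vector, lo, hi, num_bytes):
    """Quantize each coordinate onto a uniform grid that fits in num_bytes.

    Assumption: every dimension shares the range [lo, hi] and receives the
    same number of bits. This is a generic stand-in, not the paper's method.
    """
    bits = (num_bytes * 8) // len(vector)  # bits available per dimension
    if bits < 1:
        raise ValueError("byte budget too small for this dimensionality")
    levels = 1 << bits
    codes = []
    for x in vector:
        t = (x - lo) / (hi - lo)           # normalize to [0, 1]
        t = min(max(t, 0.0), 1.0)          # clamp out-of-range values
        codes.append(min(int(t * levels), levels - 1))
    return codes, bits

def reconstruct(codes, bits, lo, hi):
    """Map each grid code back to the center of its cell."""
    width = (hi - lo) / (1 << bits)
    return [lo + (c + 0.5) * width for c in codes]

def approx_distance(a, b, bits, lo, hi):
    """Euclidean distance between the cell centers of two approximations."""
    pa = reconstruct(a, bits, lo, hi)
    pb = reconstruct(b, bits, lo, hi)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(pa, pb)))

# A client would send only `codes` (2 bytes here) instead of four floats.
codes, bits = approximate([0.1, 0.9, 0.5, 0.25], lo=0.0, hi=1.0, num_bytes=2)
```

In this sketch the server could feed `approx_distance` into any standard clustering algorithm that accepts a custom distance function; the per-coordinate error is bounded by half the grid cell width, so a larger byte budget gives a tighter approximation.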

Hans-Peter Kriegel, Peter Kunath, Martin Pfeifle, Matthias Renz

Data Mining | Feature Vector Approximation | Feature Vectors | PAKDD 2005 | Vector Approximation Technique |

» Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications

» Clustering for Approximate Similarity Search in High-Dimensional Spaces

» Optimal Grid-Clustering: Towards Breaking the Curse of Dimensionality in High-Dimensional Clu...

» Subspace Clustering of High Dimensional Data

» Selectivity Estimation of High Dimensional Window Queries via Clustering

» Independence is Good: Dependency-Based Histogram Synopses for High-Dimensional Data

» PAC Nearest Neighbor Queries: Approximate and Controlled Search in High-Dimensional and Metr...

» High-Dimensional Similarity Search Using Data-Sensitive Space Partitioning

Post Info

Added	28 Jun 2010
Updated	28 Jun 2010
Type	Conference
Year	2005
Where	PAKDD
Authors	Hans-Peter Kriegel, Peter Kunath, Martin Pfeifle, Matthias Renz

Sciweavers
