Sciweavers

ICDM
2009
IEEE

On K-Means Cluster Preservation Using Quantization Schemes

This work examines under what conditions compression methodologies can retain the outcome of clustering operations. We focus on the popular k-Means clustering algorithm and demonstrate how a properly constructed compression scheme based on post-clustering quantization is capable of maintaining the global cluster structure. Our analytical derivations indicate that a 1-bit moment-preserving quantizer per cluster is sufficient to retain the original data clusters. Merits of the proposed compression technique include: a) reduced storage requirements with clustering guarantees, b) data privacy for the original values, and c) shape preservation for data visualization purposes. We evaluate the quantization scheme on various high-dimensional datasets, including 1-dimensional and 2-dimensional time series (shape datasets), and demonstrate the cluster preservation property. We also compare with previously proposed simplification techniques in the time-series area and show significant improvement...
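To make the idea of a 1-bit moment-preserving quantizer concrete, here is a minimal sketch in Python. It is not taken from the paper: the abstract does not give the authors' construction, so the code uses the standard two-level quantizer that thresholds at the mean and picks two output levels so that the sample mean and population variance are unchanged. The function names `one_bit_moment_preserving` and `reconstruct` are hypothetical, and applying the quantizer to the values of one cluster at a time is an assumption about how "per cluster" would be realized.

```python
import math


def one_bit_moment_preserving(values):
    """Two-level quantizer that keeps the mean and population variance.

    Returns (bits, low, high): one bit per value plus the two output levels.
    Illustrative sketch only; not the authors' published construction.
    """
    n = len(values)
    if n == 0:
        raise ValueError("values must be non-empty")
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    # One bit per value: 1 if the value lies above the mean, else 0.
    bits = [1 if v > mean else 0 for v in values]
    q = sum(bits)  # number of values above the mean
    if q == 0:
        # Only possible when all values are equal, so one level suffices.
        return bits, mean, mean
    # Levels chosen so that the first two moments are preserved.
    low = mean - std * math.sqrt(q / (n - q))
    high = mean + std * math.sqrt((n - q) / q)
    return bits, low, high


def reconstruct(bits, low, high):
    """Map each stored bit back to its output level."""
    return [high if b else low for b in bits]


# Usage: quantize the values of one (assumed) cluster and rebuild them.
cluster_values = [1.0, 2.0, 3.0, 10.0]
bits, low, high = one_bit_moment_preserving(cluster_values)
approx = reconstruct(bits, low, high)
```

Only the bit vector and the two levels need to be stored, which is where the storage saving comes from; the original values are not recoverable from them.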
Deepak S. Turaga, Michail Vlachos, Olivier Verscheure
Added 23 May 2010
Updated 23 May 2010
Type Conference
Year 2009
Where ICDM
Authors Deepak S. Turaga, Michail Vlachos, Olivier Verscheure