Sciweavers

SDM
2003
SIAM

Scalable, Balanced Model-based Clustering

This paper presents a general framework for adapting any generative (model-based) clustering algorithm to provide balanced solutions, i.e., clusters of comparable sizes. Partitional, model-based clustering algorithms are viewed as an iterative two-step optimization process: model re-estimation and sample re-assignment. Instead of a maximum-likelihood (ML) assignment, a balance-constrained approach is used for the sample assignment step. An efficient iterative bipartitioning heuristic is developed to reduce the computational complexity of this step and make the balanced sample assignment algorithm scalable to large datasets. We demonstrate the superiority of this approach to regular ML clustering on complex data such as arbitrary-shape 2-D spatial data, high-dimensional text documents, and EEG time series.
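
The two-step loop described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the model is a set of unit-variance spherical Gaussians (so the negative log-likelihood reduces to squared distance, as in k-means), the function names balanced_assign and balanced_kmeans are hypothetical, and the balanced assignment step uses a simple greedy capacity-limited rule rather than the authors' iterative bipartitioning heuristic.

```python
import numpy as np

def balanced_assign(cost, capacity):
    """Greedy capacity-limited assignment (illustrative stand-in only).

    cost[i, j] is the cost of placing sample i in cluster j.
    No cluster receives more than `capacity` samples.
    """
    n, k = cost.shape
    labels = np.full(n, -1)
    counts = np.zeros(k, dtype=int)
    # Visit (sample, cluster) pairs from cheapest to most expensive.
    for flat in np.argsort(cost, axis=None):
        i, j = divmod(int(flat), k)
        if labels[i] == -1 and counts[j] < capacity:
            labels[i] = j
            counts[j] += 1
    return labels

def balanced_kmeans(X, k, n_iter=20, seed=0):
    """Alternate balanced sample re-assignment and model re-estimation."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    capacity = -(-n // k)  # ceil(n / k): the largest allowed cluster size
    centers = X[rng.choice(n, size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Squared distance equals the negative log-likelihood, up to
        # constants, under the assumed unit-variance spherical Gaussians.
        cost = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        # Step 1: balance-constrained assignment instead of plain ML.
        labels = balanced_assign(cost, capacity)
        # Step 2: re-estimate each model (here, the mean) from its members.
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return labels, centers

if __name__ == "__main__":
    X = np.random.default_rng(1).normal(size=(90, 2))
    labels, _ = balanced_kmeans(X, k=3)
    print(np.bincount(labels, minlength=3))
```

The greedy rule sorts all n*k sample-cluster pairs, which is fine for small inputs but is exactly the kind of cost a scalable heuristic would need to avoid on large datasets.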
Shi Zhong, Joydeep Ghosh
Added 01 Nov 2010
Updated 01 Nov 2010
Type Conference
Year 2003
Where SDM