Sciweavers

CLASSIFICATION
2010

Intelligent Choice of the Number of Clusters in K-Means Clustering: An Experimental Study with Different Cluster Spreads

The issue of determining "the right number of clusters" in K-Means has attracted considerable interest, especially in recent years. Cluster intermix appears to be the factor most affecting the clustering results. This paper proposes an experimental setting for comparing different approaches on data generated from Gaussian clusters, with controlled parameters of between- and within-cluster spread to model cluster intermix. The setting allows for evaluating centroid recovery on a par with the conventional evaluation of cluster recovery. The subjects of our interest are two versions of the "intelligent" K-Means method, ik-Means, that find the "right" number of clusters by extracting "anomalous patterns" from the data one by one. We compare them with seven other methods, including Hartigan's rule, averaged Silhouette width and Gap statistic, under different between- and within-cluster spread-shape conditions. There are several consis...
Mark Ming-Tso Chiang, Boris Mirkin
Added 01 Mar 2011
Updated 01 Mar 2011
Type Journal
Year 2010
Where CLASSIFICATION
Authors Mark Ming-Tso Chiang, Boris Mirkin