

A New Clustering Algorithm Based on Regions of Influence with Self-Detection of the Best Number of Clusters

Clustering methods usually require the number of clusters, or another parameter such as a threshold, to be specified in advance, which is not always easy to do. This paper proposes a new graph-based clustering method called "GBC" which automatically detects the best number of clusters without requiring any other parameter. In this method, which is based on regions of influence, a graph is constructed and the edges with the highest values are cut following a hierarchical divisive procedure. An index computed from the average size of the cut edges self-detects the most appropriate number of clusters. The results of GBC on 3 quality indices (Dunn, Silhouette and Davies-Bouldin) are compared with those of K-Means, Ward's hierarchical clustering method and DBSCAN on 8 benchmarks. The experiments show that GBC performs well when clusters are well separated, even if the data are unbalanced, non-convex or contain outliers, whatever the shape of the clus...
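The divisive procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which region-of-influence graph GBC uses, so the relative neighborhood graph is assumed here as one well-known example, edge values are assumed to be Euclidean lengths, and the function names are hypothetical. The selection index itself is not reproduced because its formula is not given in the abstract; the sketch only records the mean length of the cut edges at each split.

```python
import math
from itertools import combinations


def relative_neighborhood_graph(points):
    """Return edges (length, i, j) of the relative neighborhood graph.

    Assumption: this graph stands in for the paper's region-of-influence
    graph. i and j are linked when no third point k is strictly closer
    to both of them than they are to each other.
    """
    n = len(points)
    d = [[math.dist(p, q) for q in points] for p in points]
    edges = []
    for i, j in combinations(range(n), 2):
        blocked = any(
            max(d[i][k], d[j][k]) < d[i][j]
            for k in range(n)
            if k != i and k != j
        )
        if not blocked:
            edges.append((d[i][j], i, j))
    return edges


def count_components(n, edges):
    """Count connected components with a small union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for _, i, j in edges:
        parent[find(i)] = find(j)
    return len({find(x) for x in range(n)})


def divisive_cuts(points):
    """Cut edges from longest to shortest.

    Each time the number of connected components grows, record
    (number of clusters, mean length of all edges cut so far).
    """
    n = len(points)
    edges = sorted(relative_neighborhood_graph(points))
    current = count_components(n, edges)
    cut_lengths = []
    history = []
    while edges:
        cut_lengths.append(edges.pop()[0])  # remove the longest remaining edge
        k = count_components(n, edges)
        if k > current:
            current = k
            history.append((k, sum(cut_lengths) / len(cut_lengths)))
    return history
```

On two well-separated groups of points, the first recorded split separates the groups, and the mean cut-edge length drops sharply once the procedure starts cutting edges inside a group. An index built on that quantity could use such a drop to pick the number of clusters, but how GBC actually does this is defined in the paper, not here.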
Fabrice Muhlenbach, Stéphane Lallich
Added 18 Feb 2011
Updated 18 Feb 2011
Type Journal
Year 2009
Where ICDM
Authors Fabrice Muhlenbach, Stéphane Lallich