Clustering is a prominent method in data mining. It is a discovery process that groups data so that intra-cluster similarity is maximized and inter-cluster similarity is minimized. Clustering has been widely used in a variety of areas, and many clustering algorithms have been developed in response. Almost every report emphasizes the differences among algorithms and ignores their similarities. This is true in general and specifically for the algorithms of central concern in this paper: agglomerative hierarchical ones. The principal view adopted here is that improved clustering quality can be achieved by exploiting commonalities among methods, for example, considerations relating to how clusters are merged and the criteria for merging, such as single-link merging (SLINK, OPTICS), edge-cut merging (CHAMELEON, ROCK), and criteria based on the square of the adjacency matrix (OPTICS, ROCK). MABAC (matrix based clustering), a proposed algorithm, introduces a goodness function based on notions of link and in...
Yonghui Chen, Alan P. Sprague, Kevin D. Reilly
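
To make the "square of the adjacency matrix" criterion named in the abstract concrete, the sketch below squares a 0/1 adjacency matrix, so that entry (i, j) counts the neighbors shared by points i and j. This is the common-neighbor notion of a link used in ROCK. It is only an illustration under stated assumptions: the function name `common_neighbor_counts` and the toy graph are hypothetical, and this is not MABAC's goodness function, which the abstract does not specify.

```python
def common_neighbor_counts(adjacency):
    """Return the square of a 0/1 adjacency matrix.

    Entry [i][j] of the result is the number of nodes adjacent to both
    i and j, i.e. the count of common neighbors (hypothetical helper).
    """
    n = len(adjacency)
    return [
        [sum(adjacency[i][k] * adjacency[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


# Toy undirected graph (an assumption for illustration):
# edges 0-1, 0-2, 1-2, 2-3, with no self-loops.
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]

links = common_neighbor_counts(A)

for row in links:
    print(row)
```

Here `links[0][3]` is 1 because nodes 0 and 3 are not adjacent but share neighbor 2, while `links[2][3]` is 0 because nodes 2 and 3 are adjacent yet share no neighbor. A merging criterion built on such counts can therefore favor pairs that sit in the same dense neighborhood over pairs joined only by a single edge.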