To aid software analysis and maintenance tasks, a number of software clustering algorithms have been proposed to automatically partition a software system into meaningful subsystems or clusters. However, it is unknown whether these algorithms produce similar meaningful clusterings for similar versions of a real-life software system under continual change and growth. This paper describes a comparative study of six software clustering algorithms. We applied each of the algorithms to subsequent versions from five large open source systems. We conducted comparisons based on three criteria respectively: stability (Does the clustering change only modestly as the system undergoes modest updating?), authoritativeness (Does the clustering reasonably approximate the structure an authority provides?) and extremity of cluster distribution (Does the clustering avoid huge clusters and many very small clusters?). Experimental results indicate that the studied algorithms exhibit distinct characteris...
Jingwei Wu, Ahmed E. Hassan, Richard C. Holt