As software systems age, they tend to deviate from their original design and architecture, and it becomes increasingly difficult to manage and maintain them. We explore the idea of software clustering for reverse engineering and re-modularization. Clustering together software artifacts provides an effective technique for discovering high-level abstract entities within a system. Previous work on software clustering has identified many areas where further investigation is required. Clustering techniques should be tuned to the type of system they are being applied to. In this paper we explore a new clustering algorithm called the ‘combined’ algorithm which, as our experiments show, provides more promising results for software clustering than previously used algorithms. We also analyze the behavior of correlation and distance metrics for binary features.
M. Saeed, Onaiza Maqbool, Haroon A. Babri, Syed Za