Source code version repositories hold a wealth of information about the changes introduced into a system throughout its evolution. These repositories are typically managed by tools such as CVS. However, these tools identify and express changes in terms of physical attributes, i.e., files and line numbers. Recently, to help support the mining of software repositories (MSR), researchers have proposed methods to derive and express changes from source code repositories in a more source-code “aware” manner (i.e., in terms of syntax and semantics). Here, we discuss these MSR techniques in light of what changes are identified, how they are expressed, the adopted methodology, the evaluation, and the results. This work forms the basis for a taxonomic description of MSR approaches.

Categories and Subject Descriptors: D.2.7 [Software Engineering]: Distribution, Maintenance, and Enhancement – documentation, enhancement, extensibility, version control

General Terms: Management, Experimentation

Keywords ...
Huzefa H. Kagdi, Michael L. Collard, Jonathan I. M