Detailed knowledge about implemented concerns in the source code is crucial for the cost-effective maintenance and successful evolution of large systems. Concern mining techniques can automatically suggest sets of related code fragments that likely contribute to the implementation of a concern. However, developers must then spend considerable time understanding and expanding these concern seeds to obtain the full concern implementation. We propose a new mining technique (COMMIT) that reduces this manual effort. COMMIT addresses three major shortcomings of current concern mining techniques: 1) their inability to merge seeds with small variations, 2) their tendency to ignore important facets of concerns, and 3) their lack of information about the relations between seeds. A comparative case study on two large open source C systems (PostgreSQL and NetBSD) shows that COMMIT recovers up to 87.5% more unique concerns than two leading concern mining techniques, and that the three techniques c...
Bram Adams, Zhen Ming Jiang, Ahmed E. Hassan