Code clone management has been shown to have several benefits for software developers. As source code evolves, clone management requires a mechanism that detects code clones in each new revision efficiently and incrementally. This paper introduces ClemanX, an incremental clone detection tool. ClemanX represents code fragments as subtrees of Abstract Syntax Trees (ASTs), measures their similarity using characteristic vectors of structural features, and formulates the incremental detection of similar code as an incremental distance-based clustering problem. Our empirical evaluation on large-scale software projects shows that ClemanX is useful and performs well.
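To make the vector-based similarity idea concrete, the following is a minimal sketch in Python, not the ClemanX implementation. It assumes a deliberately simple feature set (counts of AST node types per subtree) and Euclidean distance; the helper names `characteristic_vector`, `distance`, and `function_vectors` are hypothetical, and the incremental clustering step is not shown.

```python
import ast
import math
from collections import Counter


def characteristic_vector(node: ast.AST) -> Counter:
    """Count AST node types in the subtree rooted at `node`.

    Assumption: node-type counts stand in for the structural features;
    the actual feature set used by the tool may differ.
    """
    return Counter(type(child).__name__ for child in ast.walk(node))


def distance(u: Counter, v: Counter) -> float:
    """Euclidean distance between two sparse count vectors."""
    keys = set(u) | set(v)
    return math.sqrt(sum((u[k] - v[k]) ** 2 for k in keys))


def function_vectors(source: str) -> dict:
    """Map each function name in `source` to the vector of its AST subtree."""
    tree = ast.parse(source)
    return {
        node.name: characteristic_vector(node)
        for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef)
    }


SAMPLE = '''
def total(xs):
    s = 0
    for x in xs:
        s += x
    return s

def summed(values):
    acc = 0
    for v in values:
        acc += v
    return acc

def greet(name):
    print("hello", name)
'''

vectors = function_vectors(SAMPLE)
# Fragments that differ only in identifier names have identical vectors.
d_clone = distance(vectors["total"], vectors["summed"])
# A structurally different fragment lies at a positive distance.
d_other = distance(vectors["total"], vectors["greet"])
```

In this sketch, fragments whose distance falls below a chosen threshold would be treated as clone candidates; when a revision changes only a few fragments, only their vectors need to be recomputed and compared again.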
Tung Thanh Nguyen, Hoan Anh Nguyen, Jafar M. Al-Ko