Abstract. Structure-oriented approaches in clone detection have become popular in both code-based and model-based clone detection. However, existing methods for capturing structural information in software artifacts are either too computationally expensive to be efficient or too light-weight to be accurate in clone detection. In this paper, we present Exas, an accurate and efficient structural characteristic feature extraction approach that better approximates and captures the structure within the fragments of artifacts. Exas structural features are the sequences of labels and numbers built from nodes, edges, and paths of various lengths of a graph-based representation. A fragment is characterized by a structural characteristic vector of the occurrence counts of those features. We have applied Exas in building two clone detection tools for source code and models. Our analytic study and empirical evaluation on open-source software show that Exas and its algorithm for computing the chara...
Hoan Anh Nguyen, Tung Thanh Nguyen, Nam H. Pham, J