Sciweavers

KDD
2006
ACM

GPLAG: detection of software plagiarism by program dependence graph analysis

The growth of open source projects has also made software plagiarism more convenient. A company, if less self-disciplined, may be tempted to plagiarize open source projects for its own products. Although current plagiarism detection tools appear sufficient for academic use, they fall short against serious plagiarists. For example, disguises such as statement reordering and code insertion can effectively confuse these tools. In this paper, we develop a new plagiarism detection tool, called GPlag, which detects plagiarism by mining program dependence graphs (PDGs). A PDG is a graphical representation of the data and control dependencies within a procedure. Because PDGs are nearly invariant during plagiarism, GPlag is more effective than state-of-the-art tools for plagiarism detection. To make GPlag scalable to large programs, a statistical lossy filter is proposed to prune the plagiarism search space. An experimental study shows that GPlag is both ef...
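
The claim that dependence graphs survive disguises such as statement reordering and code insertion can be illustrated with a toy example. The sketch below is not GPlag's algorithm and does not build a full PDG: it computes only data-dependence edges for straight-line code, and the statement encoding and the function name `data_dependence_edges` are assumptions made for illustration.

```python
def data_dependence_edges(stmts):
    """Return data-dependence edges for straight-line code.

    Each statement is a hypothetical triple (label, defined_var, used_vars).
    An edge (d, u) means statement u reads a variable last defined by d.
    """
    last_def = {}   # variable name -> label of the statement that last defined it
    edges = set()
    for label, target, uses in stmts:
        for var in uses:
            if var in last_def:
                edges.add((last_def[var], label))
        last_def[target] = label
    return edges


original = [
    ("a = 1", "a", []),
    ("b = 2", "b", []),
    ("c = a + b", "c", ["a", "b"]),
    ("d = c * 2", "d", ["c"]),
]

# Disguise 1: swap the two independent assignments.
reordered = [original[1], original[0], original[2], original[3]]

# Disguise 2: insert a statement that nothing else depends on.
with_insertion = original[:2] + [("tmp = 0", "tmp", [])] + original[2:]

edges_original = data_dependence_edges(original)
edges_reordered = data_dependence_edges(reordered)
edges_inserted = data_dependence_edges(with_insertion)

# Reordering independent statements leaves the edge set unchanged.
print(edges_original == edges_reordered)
# Inserting unrelated code keeps every original edge.
print(edges_original <= edges_inserted)
```

In this toy setting the textual order of the statements changes, but the dependence edges do not, which is the intuition behind comparing graphs rather than token sequences.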
Chao Liu 0001, Chen Chen, Jiawei Han, Philip S. Yu
Added 30 Nov 2009
Updated 30 Nov 2009
Type Conference
Year 2006
Where KDD