
FECS
2006

A Lightweight Program Similarity Detection Model using XML and Levenshtein Distance

Program plagiarism is one of the most significant problems in Computer Science education. The most common forms of plagiarism include modifying comments, reordering statements, and changing variable names. Detecting such simple changes, however, requires excessive string comparisons. This paper presents a lightweight program similarity detection model. Unlike other detection models, our model avoids global string comparisons: string matching is performed only locally, when comparing control sequences. To this end, we use XML and the Levenshtein distance algorithm. XML's tree-like representation reduces the intensive string comparisons otherwise needed to handle the simple modifications, and the Levenshtein distance algorithm makes our model reliable against logic changes. Our approach is based on the XPDec model and can distinguish a flat structure from a nested structure of control sequences. This improvement should lead to simple and reliable implementations of program similarity detection systems.
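The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation or the XPDec format. The XML element names (`program`, `for`, `if`), the bracket tokens used to mark nesting, and the helper names `control_sequence` and `similarity` are all assumptions made for illustration. The sketch reduces each program to a sequence of control-structure tokens taken from an XML tree, then compares the sequences with the standard Levenshtein distance.

```python
import xml.etree.ElementTree as ET


def control_sequence(elem):
    """Flatten an XML tree of control structures into a token list.

    Hypothetical encoding: "(" and ")" mark entry into and exit from a
    structure's body, so nesting survives the flattening.
    """
    tokens = []
    for child in elem:
        tokens.append(child.tag)
        tokens.append("(")
        tokens.extend(control_sequence(child))
        tokens.append(")")
    return tokens


def levenshtein(a, b):
    """Edit distance between two sequences; each edit costs 1."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Scale the distance to [0, 1]; 1.0 means identical sequences."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


# Hypothetical XML for two programs: a loop followed by a branch,
# and a branch nested inside a loop.
flat = control_sequence(ET.fromstring("<program><for/><if/></program>"))
nested = control_sequence(ET.fromstring("<program><for><if/></for></program>"))

print(levenshtein(flat, nested), round(similarity(flat, nested), 3))
```

Because comments and identifiers never enter the token sequence in this encoding, renaming a variable or editing a comment leaves the distance at 0, while the flat and nested structures above are two edits apart.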
Seo-Young Noh, Sangwoo Kim, Cheonyoung Jung
Added 31 Oct 2010
Updated 31 Oct 2010
Type Conference
Year 2006
Where FECS
Authors Seo-Young Noh, Sangwoo Kim, Cheonyoung Jung