Towards Document Plagiarism Detection Based on the Relevance and Fragmentation of the Reused Text


Traditionally, External Plagiarism Detection has been carried out by identifying and measuring the similar sections between a given pair of documents, known as the source and suspicious documents. One of the main difficulties of this task lies in the fact that not all similar text sections are examples of plagiarism, since thematic coincidences also tend to produce portions of common text. To address this problem, in this paper we propose to represent the common (possibly reused) text by means of a set of features that denote its relevance and fragmentation. This new representation, used in conjunction with supervised learning algorithms, provides more elements for the automatic detection of document plagiarism; in particular, our experimental results show that it clearly outperformed the accuracy achieved by traditional n-gram-based approaches.
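
As a rough illustration of the idea in the abstract, the sketch below extracts the word sequences shared by a source and a suspicious document and summarizes them as a few numeric features. The abstract does not list the paper's actual features, so the names used here (`num_fragments`, `longest_fragment`, `mean_fragment_length`, `coverage`) and the `min_len` threshold are assumptions chosen for illustration only. The alignment relies on Python's `difflib.SequenceMatcher`, which matches blocks in order and would therefore miss reordered reuse.

```python
from difflib import SequenceMatcher


def common_fragments(source, suspicious, min_len=2):
    """Return in-order shared word sequences of at least min_len words."""
    src = source.lower().split()
    sus = suspicious.lower().split()
    matcher = SequenceMatcher(None, src, sus, autojunk=False)
    return [src[m.a:m.a + m.size]
            for m in matcher.get_matching_blocks()
            if m.size >= min_len]


def fragment_features(source, suspicious, min_len=2):
    """Hypothetical features; NOT the feature set defined in the paper."""
    fragments = common_fragments(source, suspicious, min_len)
    lengths = [len(f) for f in fragments]
    total_words = len(suspicious.split())
    shared = sum(lengths)
    return {
        # How many separate pieces the common text is split into.
        "num_fragments": len(fragments),
        # Length, in words, of the longest shared piece.
        "longest_fragment": max(lengths, default=0),
        # Average length of a shared piece.
        "mean_fragment_length": shared / len(lengths) if lengths else 0.0,
        # Share of the suspicious document covered by common text.
        "coverage": shared / total_words if total_words else 0.0,
    }


if __name__ == "__main__":
    print(fragment_features(
        "the quick brown fox jumps over the lazy dog",
        "a quick brown fox sleeps near the lazy dog",
    ))
```

A feature dictionary of this kind, computed for each document pair and labeled as plagiarized or not, could then be passed to any supervised classifier; which classifier the authors used is not stated in the abstract.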


Artificial Intelligence | External Plagiarism Detection | MICAI 2010 | Plagiarism | Similar Text Sections


Post Info

Added	14 Feb 2011
Updated	14 Feb 2011
Type	Journal
Year	2010
Where	MICAI
Authors	Fernando Sánchez-Vega, Luis Villaseñor Pineda, Manuel Montes-y-Gómez, Paolo Rosso
