Plagiarism Detection in arXiv

We describe a large-scale application of methods for finding plagiarism and self-plagiarism in research document collections. The methods are applied to a collection of 284,834 documents collected by arXiv.org over a 14 year period, covering a few different research disciplines. The methodology efficiently detects a variety of problematic author behaviors, and heuristics are developed to reduce the number of false positives. The methods are also efficient enough to implement as a real-time submission screen for a collection many times larger.

Daria Sorokina, Johannes Gehrke, Simeon Warner, Paul Ginsparg

Data Mining | ICDM 2006 | Problematic Author Behaviors | Real-time Submission Screen | Research Document Collections |

» Who's the Thief? Automatic Detection of the Direction of Plagiarism

» Desktop Tools for Offline Plagiarism Detection in Computer Programs

» Intrinsic Plagiarism Detection

» Exploring Fingerprinting as External Plagiarism Detection Method Lab Report for PAN at CL...

» Corpus and Evaluation Measures for Automatic Plagiarism Detection

» Plagiarism A Survey

» Approaches for Intrinsic and External Plagiarism Detection Notebook for PAN at CLEF 2011

» External Plagiarism Detection Based on Standard IR Technology and Fast Recognition of Comm...

Post Info

Added	11 Jun 2010
Updated	11 Jun 2010
Type	Conference
Year	2006
Where	ICDM
Authors	Daria Sorokina, Johannes Gehrke, Simeon Warner, Paul Ginsparg

Sciweavers
