Author identification models fall into two major categories according to the way they handle the training texts: profile-based models produce one representation per author while in...
Wikipedia vandalism identification is a very complex issue, which is now mostly solved manually by volunteers. This paper presents the main components of a system built by our grou...
Plagiarism detection has been considered as a classification problem which can be approximated with intrinsic strategies, considering self-based information from a given document,...
Gabriel Oberreuter, Gaston L'Huillier, Sebasti&aac...
Wikipedia despite having a very small budget has been among the top ten most visited websites for over half a decade. Being this visible also generated the problem of ill intended ...
Abstract In this paper, we describe a novel approach to intrinsic plagiarism detection. Each suspicious document is divided into a series of consecutive, potentially overlapping ...