Sciweavers

9 search results - page 1 / 2
CICLING
2010
Springer
13 years 11 months ago
Word Length n-Grams for Text Re-use Detection
Abstract. The automatic detection of shared content in written documents (which includes text reuse and its unacknowledged commitment, plagiarism) has become an important probl...
Alberto Barrón-Cedeño, Chiara Basile...
ECIR
2009
Springer
14 years 4 months ago
On Automatic Plagiarism Detection Based on n-Grams Comparison
Abstract. When automatic plagiarism detection is carried out considering a reference corpus, a suspicious text is compared to a set of original documents in order to relate the pla...
Alberto Barrón-Cedeño, Paolo Rosso
DRR
2009
13 years 5 months ago
Text-image alignment for historical handwritten documents
We describe our work on text-image alignment in the context of building a historical document retrieval system. We aim at aligning images of words in handwritten lines with their text...
Svitlana Zinger, John Nerbonne, Lambert Schomaker
ICDAR
2007
IEEE
14 years 1 month ago
An Efficient Word Segmentation Technique for Historical and Degraded Machine-Printed Documents
Word segmentation is a crucial step for segmentation-free document analysis systems and is used for creating an index based on word matching. In this paper, we propose a novel met...
Michael Makridis, N. Nikolaou, Basilios Gatos
COLING
1996
13 years 8 months ago
The Automatic Extraction of Open Compounds from Text Corpora
This paper describes a new method for extracting open compounds (uninterrupted sequences of words) from text corpora of languages, such as Thai, Japanese and Korean, that exhibit un...
Virach Sornlertlamvanich, Hozumi Tanaka