Sciweavers

111 search results - page 16 / 23
CLEF
2010
Springer
13 years 9 months ago
External and Intrinsic Plagiarism Detection Using a Cross-Lingual Retrieval and Segmentation System - Lab Report for PAN at CLEF
We present our hybrid system for the PAN challenge at CLEF 2010. Our system performs plagiarism detection for translated and non-translated externally as well as intrinsically plag...
Markus Muhr, Roman Kern, Mario Zechner, Michael Gr...
ACL
2006
13 years 9 months ago
Maximum Entropy Based Restoration of Arabic Diacritics
Short vowels and other diacritics are not part of written Arabic scripts. Exceptions are made for important political and religious texts and in scripts for beginning students of ...
Imed Zitouni, Jeffrey S. Sorensen, Ruhi Sarikaya
FUIN
2006
13 years 8 months ago
On-line Approximate String Matching in Natural Language
We consider approximate pattern matching in natural language text. We use the words of the text as the alphabet, instead of the characters as in traditional string matching approac...
Kimmo Fredriksson
ACL
2010
13 years 6 months ago
Event-Based Hyperspace Analogue to Language for Query Expansion
Bag-of-words approaches to information retrieval (IR) are effective but assume independence between words. The Hyperspace Analogue to Language (HAL) is a cognitively motivated and...
Tingxu Yan, Tamsin Maxwell, Dawei Song, Yuexian Ho...
ICDAR
1997
IEEE
14 years 7 days ago
Representing OCRed documents in HTML
ABSTRACT: OCR is an error-prone process. It is time-consuming and expensive to manually proofread OCR results. The errors remaining in OCRed texts can cause serious problems in rea...
Tao Hong, Sargur N. Srihari