Sciweavers

» Language Model Based Arabic Word Segmentation
NAACL
2010
13 years 5 months ago
Automatic Diacritization for Low-Resource Languages Using a Hybrid Word and Consonant CMM
We are interested in diacritizing Semitic languages, especially Syriac, using only diacritized texts. Previous methods have required the use of tools such as part-of-speech tagger...
Robbie Haertel, Peter McClanahan, Eric K. Ringger
ACL
2008
13 years 8 months ago
Lexicalized Phonotactic Word Segmentation
This paper presents a new unsupervised algorithm (WordEnds) for inferring word boundaries from transcribed adult conversations. Phone n-grams before and after observed pauses are u...
Margaret M. Fleck
ACL
2006
13 years 8 months ago
Maximum Entropy Based Restoration of Arabic Diacritics
Short vowels and other diacritics are not part of written Arabic scripts. Exceptions are made for important political and religious texts and in scripts for beginning students of ...
Imed Zitouni, Jeffrey S. Sorensen, Ruhi Sarikaya
ICDAR
2009
IEEE
13 years 5 months ago
Recognition-Based Segmentation Algorithm for On-Line Arabic Handwriting
In this paper, we introduce an on-line Arabic handwriting recognition system based on a new stroke segmentation algorithm. The proposed algorithm uses an over-segmentation method th...
Khaled Daifallah, Nizar Zarka, Hassan Jamous
COLING
2010
13 years 2 months ago
Nonparametric Word Segmentation for Machine Translation
We present an unsupervised word segmentation model for machine translation. The model uses existing monolingual segmentation techniques and models the joint distribution over sour...
ThuyLinh Nguyen, Stephan Vogel, Noah A. Smith