Sciweavers

128 search results - page 7 / 26
» Automatic Sense Tagging Using Parallel Corpora
ICCPOL
2009
Springer
14 years 20 days ago
Constructing Parallel Corpus from Movie Subtitles
This paper describes a methodology for constructing aligned German-Chinese corpora from movie subtitles. The corpora will be used to train a special machine translation s...
Han Xiao, Xiaojie Wang
ACL
2007
13 years 9 months ago
Automatic Part-of-Speech Tagging for Bengali: An Approach for Morphologically Rich Languages in a Poor Resource Scenario
This paper describes our work on building a Part-of-Speech (POS) tagger for Bengali. We have used Hidden Markov Model (HMM) and Maximum Entropy (ME) based stochastic taggers. Bengali...
Sandipan Dandapat, Sudeshna Sarkar, Anupam Basu
NAACL
2010
13 years 6 months ago
Extracting Parallel Sentences from Comparable Corpora using Document Level Alignment
The quality of a statistical machine translation (SMT) system is heavily dependent upon the amount of parallel sentences used in training. In recent years, there have been several...
Jason R. Smith, Chris Quirk, Kristina Toutanova
IPM
2006
13 years 8 months ago
Automatic extraction of bilingual word pairs using inductive chain learning in various languages
In this paper, we propose a new learning method for extracting bilingual word pairs from parallel corpora in various languages. In cross-language information retrieval, the system...
Hiroshi Echizen-ya, Kenji Araki, Yoshio Momouchi
COLING
2008
13 years 9 months ago
OntoNotes: Corpus Cleanup of Mistaken Agreement Using Word Sense Disambiguation
Annotated corpora are only useful if their annotations are consistent. Most large-scale annotation efforts take special measures to reconcile inter-annotator disagreement. To date...
Liang-Chih Yu, Chung-Hsien Wu, Eduard H. Hovy