Sciweavers

ACL
2007
14 years 9 days ago
Vocabulary Decomposition for Estonian Open Vocabulary Speech Recognition
Speech recognition in many morphologically rich languages suffers from a very high out-of-vocabulary (OOV) ratio. Earlier work has shown that vocabulary decomposition methods can ...
Antti Puurula, Mikko Kurimo
ACL
2008
Mining Parenthetical Translations from the Web by Word Alignment
Documents in languages such as Chinese, Japanese and Korean sometimes annotate terms with their translations in English inside a pair of parentheses. We present a method to extrac...
Dekang Lin, Shaojun Zhao, Benjamin Van Durme, Mari...
ACL
2008
Collecting a Why-Question Corpus for Development and Evaluation of an Automatic QA-System
Question answering research has only recently started to spread from short factoid questions to more complex ones. One significant challenge is the evaluation: manual evaluation i...
Joanna Mrozinski, Edward W. D. Whittaker, Sadaoki ...
ACL
2008
Correcting Misuse of Verb Forms
This paper proposes a method to correct English verb form errors made by non-native speakers. A basic approach is template matching on parse trees. The proposed method improves on...
John Lee, Stephanie Seneff
ACL
2007
Chinese Segmentation with a Word-Based Perceptron Algorithm
Standard approaches to Chinese word segmentation treat the problem as a tagging task, assigning labels to the characters in the sequence indicating whether the character marks a w...
Yue Zhang, Stephen Clark
ACL
2008
Applying a Grammar-Based Language Model to a Simplified Broadcast-News Transcription Task
We propose a language model based on a precise, linguistically motivated grammar (a hand-crafted Head-driven Phrase Structure Grammar) and a statistical model estimating the proba...
Tobias Kaufmann, Beat Pfister
ACL
2007
A Simple, Similarity-based Model for Selectional Preferences
We propose a new, simple model for the automatic induction of selectional preferences, using corpus-based semantic similarity metrics. Focusing on the task of semantic role labeli...
Katrin Erk
ACL
2008
Multilingual Harvesting of Cross-Cultural Stereotypes
People rarely articulate explicitly what a native speaker of a language is already assumed to know. So to acquire the stereotypical knowledge that underpins much of what is said i...
Tony Veale, Yanfen Hao, Guofu Li
ACL
2007
Boosting Statistical Machine Translation by Lemmatization and Linear Interpolation
Data sparseness is one of the factors that degrade statistical machine translation (SMT). Existing work has shown that using morphosyntactic information is an effective solution t...
Ruiqiang Zhang, Eiichiro Sumita