Training statistical models to detect nonnative sentences requires a large corpus of non-native writing samples, which is often not readily available. This paper examines the exte...
This paper describes an efficient method to extract large n-best lists from a word graph produced by a statistical machine translation system. The extraction is based on the k sh...
In the framework of the Tc-Star project, we analyze and propose a combination of two Statistical Machine Translation systems: a phrase-based and an N-gram-based one. The exhaustiv...
Dependency analysis of natural language gives rise to non-projective structures. The constraint of well-nestedness on dependency trees has been recently shown to give a good fit ...
We present a method for utilizing unannotated sentences to improve a semantic parser which maps natural language (NL) sentences into their formal meaning representations (MRs). Gi...
This paper describes a new grapheme-tophoneme framework, based on a combination of formal linguistic and statistical methods. A context-free grammar is used to parse words into th...
We present a method for automatic determiner selection, based on an existing language model. We train on the Penn Treebank and also use additional data from the North American New...
This paper presents a three-step dependency parser to parse Chinese deterministically. By dividing a sentence into several parts and parsing them separately, it aims to reduce the...
We present results from a new Interagency Language Roundtable (ILR) based comprehension test. This new test design presents questions at multiple ILR difficulty levels within each...
Douglas Jones, Martha Herzog, Hussny Ibrahim, Arvi...