This paper describes the creation of the first large-scale corpus containing drafts and final versions of essays written by non-native speakers, with the sentences aligned acros...
We propose a new approach to characterizing the timeline of a text: temporal dependency structures, where all the events of a narrative are linked via partial ordering relations l...
Oleksandr Kolomiyets, Steven Bethard, Marie-Franci...
Temporal reasoners for document understanding typically assume that a document’s creation date is known. Algorithms to ground relative time expressions and order events often re...
This paper introduces BIUTEE1 , an opensource system for recognizing textual entailment. Its main advantages are its ability to utilize various types of knowledge resources, and i...
We propose several techniques for improving statistical machine translation between closely-related languages with scarce resources. We use character-level translation trained on ...
We investigate the problem of acoustic modeling in which prior language-specific knowledge and transcribed data are unavailable. We present an unsupervised model that simultaneou...
We present a Bayesian nonparametric model for estimating tree insertion grammars (TIG), building upon recent work in Bayesian inference of tree substitution grammars (TSG) via Dir...
This paper proposes a data-driven method for concept-to-text generation, the task of automatically producing textual output from non-linguistic input. A key insight in our approac...
We demonstrate applications of psycholinguistic and sublexical information for learning Chinese characters. The knowledge about the grapheme-phoneme conversion (GPC) rules of lang...