Extensive and deep paraphrase corpora are important for a variety of natural language processing and user interaction tasks. In this paper, we present an approach which i) collect...
The paper describes a particular approach to multiengine machine translation (MEMT), where we make use of voted language models to selectively combine translation outputs from mul...
Modern statistical parsers are trained on large annotated corpora (treebanks). These treebanks usually consist of sentences addressing different subdomains (e.g. sports, politics,...
A significant portion of the world's text is tagged by readers on social bookmarking websites. Credit attribution is an inherent problem in these corpora because most pages h...
Daniel Ramage, David Hall, Ramesh Nallapati, Chris...
Abstract. Morphological knowledge (inflection, derivation, compounds) is useful for medical language processing. Some is available for medical English in the UMLS Specialist Lexic...