This paper presents ongoing research in clinical information extraction. This work introduces a new genre of text which are not well-written, noise prone, ungrammatical and with m...
We investigate the automatic detection of sentences containing linguistic hedges using corpus statistics and syntactic patterns. We take Wikipedia as an already annotated corpus u...
In this paper, we present a discriminative word-character hybrid model for joint Chinese word segmentation and POS tagging. Our word-character hybrid model offers high performance...
Hierarchical phrase-based models are attractive because they provide a consistent framework within which to characterize both local and long-distance reorderings, but they also ma...
Hendra Setiawan, Min-Yen Kan, Haizhou Li, Philip R...
We describe Joshua (Li et al., 2009a)1, an open source toolkit for statistical machine translation. Joshua implements all of the algorithms required for translation via synchronou...
Zhifei Li, Chris Callison-Burch, Chris Dyer, Juri ...
We developed caitra, a novel tool that aids human translators by (a) making suggestions for sentence completion in an interactive machine translation setting, (b) providing altern...
This paper presents a system for querying treebanks. The system consists of a powerful query language with natural support for cross-layer queries, a client interface with a graph...
This work describes an online application that uses Natural Language Generation (NLG) methods to generate walking directions in combination with dynamic 2D visualisation. We make ...