Hindi and Urdu share a common phonology, morphology and grammar but are written in different scripts. In addition, the vocabularies have also diverged significantly especially in ...
We propose a new statistical approach to extracting personal names from a corpus. One of the key points of our approach is that it can both automatically learn the characteristics...
We connect two scenarios in structured learning: adapting a parser trained on one corpus to another annotation style, and projecting syntactic annotations from one language to ano...
This paper proposes a data-driven method for concept-to-text generation, the task of automatically producing textual output from non-linguistic input. A key insight in our approac...
This paper describes the results of some experiments exploring statistical methods to infer syntactic categories from a raw corpus in an unsupervised fashion. It shares certain po...