Bilingual lexicons are fundamental resources. Modern automated lexicon generation methods usually require parallel corpora, which are not available for most language pairs. Lexico...
In this paper, we study the problem of using an annotated corpus in English for the same natural language processing task in another language. While various machine translation sy...
We define a probabilistic morphological analyzer using a data-driven approach for Syriac in order to facilitate the creation of an annotated corpus. Syriac is an under-resourced S...
Peter McClanahan, George Busby, Robbie Haertel, Kr...
A significant portion of the world's text is tagged by readers on social bookmarking websites. Credit attribution is an inherent problem in these corpora because most pages h...
Daniel Ramage, David Hall, Ramesh Nallapati, Chris...
When research articles introduce new findings or concepts they typically relate them only to knowledge and domain concepts of immediate relevance. However, many domain concepts re...