Chinese characters that are similar in their pronunciations or in their internal structures are useful for computer-assisted language learning and for psycholinguistic studies. Al...
A process that attempts to solve abbreviation ambiguity is presented. Various context-related features and statistical features have been explored. Almost all features are domain i...
Given several systems' automatic translations of the same sentence, we show how to combine them into a confusion network, whose various paths represent composite translations...
Damianos Karakos, Jason Eisner, Sanjeev Khudanpur,...
Natural Language Processing (NLP) for Information Retrieval has always been an interesting and challenging research area. Despite the high expectations, most of the results indica...
Underspecification-based algorithms for processing partially disambiguated discourse structure must cope with extremely high numbers of readings. Based on previous work on dominan...
We investigate elaborative summarisation, where the aim is to identify supplementary information that expands upon a key fact. We envisage such summaries being useful when browsin...
We investigate the tasks of general morphological tagging, diacritization, and lemmatization for Arabic. We show that for all tasks we consider, both modeling the lexeme explicitl...
Ryan Roth, Owen Rambow, Nizar Habash, Mona T. Diab...
This paper introduces a new method for identifying named-entity (NE) transliterations within bilingual corpora. Current state-of-the-art approaches usually require annotated data a...
Splitting compound words has proved to be useful in areas such as Machine Translation, Speech Recognition or Information Retrieval (IR). Furthermore, real-time IR systems (such as...