This paper proposes using monolingual collocations to improve Statistical Machine Translation (SMT). We use collocation probabilities, which are estimated from monoli...
We propose a semantic tagger that provides high-level concept information for phrases based on several kinds of low-level information about words in clinical narrative texts. The ...
Word-based compression over natural language text has been shown to be a good choice for trading off compression ratio against speed, obtaining compression ratios close to 30% and very fast decom...
We describe an efficient technique to weigh word-based features in binary classification tasks and show that it significantly improves classification accuracy on a range of proble...
Justin Martineau, Tim Finin, Anupam Joshi, Shamit ...
In this paper we examine the retrieval performance of adjacent and concurrent n-grams generated from polyphonic music data. We deploy a method to index polyphonic music using a wo...