Large amounts of low- to medium-quality English texts are now being produced by machine translation (MT) systems, optical character readers (OCR), and non-native speakers of Engli...
This paper describes a framework for defining domain specific Feature Functions in a user friendly form to be used in a Maximum Entropy Markov Model (MEMM) for the Named Entity Re...
In this paper, we propose a new learning method for extracting bilingual word pairs from parallel corpora in various languages. In cross-language information retrieval, the system...
Currently, a number of distributed software systems development tools exist, but typically they are designed either to satisfy industrial standards – industrial perspective – o...
Alcides Calsavara, Agnaldo K. Noda, Juarez da Cost...
Boosting algorithms build highly accurate prediction mechanisms from a collection of lowaccuracy predictors. To do so, they employ the notion of weak-learnability. The starting po...