We propose a new probabilistic approach to information retrieval based upon the ideas and methods of statistical machine translation. The central ingredient in this approach is a ...
Hindi and Urdu share a common phonology, morphology and grammar but are written in different scripts. In addition, the vocabularies have diverged significantly, especially in ...
In this paper, we present CaptionEye/KE, a Korean-to-English machine translation system applied to practical TV caption translation. Its experimental evaluation is p...
Seong-il Yang, Young Kil Kim, Young Ae Seo, Sung-K...
In this paper we introduce a statistical Named Entity recognizer (NER) system for the Hungarian language. We examined three methods for identifying and disambiguating proper nouns...
While strings and syntax trees are used by the Natural Language Processing community to represent the structure of spoken languages, these encodings are difficult to adapt to a si...