This paper1 presents an empirical approach to mining parallel corpora. Conventional approaches use a readily available collection of comparable, nonparallel corpora to extract par...
In order to search corpora written in two or more languages, the simplest and most efficient approach is to translate the query submitted into the required language(s). To achieve...
This paper introduces deep syntactic structures to syntax-based Statistical Machine Translation (SMT). We use a Head-driven Phrase Structure Grammar (HPSG) parser to obtain the de...
We present a novel paradigm for statistical machine translation (SMT), based on a joint modeling of word alignment and the topical aspects underlying bilingual document-pairs, via...
Hindi and Urdu share a common phonology, morphology and grammar but are written in different scripts. In addition, the vocabularies have also diverged significantly especially in ...