This paper reports on the benefits of large-scale statistical language modeling in machine translation. A distributed infrastructure is proposed which we use to train on up to 2 t...
Thorsten Brants, Ashok C. Popat, Peng Xu, Franz Jo...
Statistical machine translation systems are usually trained on large amounts of bilingual text (used to learn a translation model), and also large amounts of monolingual text in th...
In this work, we propose two extensions of standard word lexicons in statistical machine translation: A discriminative word lexicon that uses sentence-level source information to ...
We describe the use of energy function optimisation in very shallow syntactic parsing. The approach can use linguistic rules and corpus-based statistics, so the strengths of both ...
We propose several techniques for improving statistical machine translation between closely related languages with scarce resources. We use character-level translation trained on ...