This paper reports on the benefits of large-scale statistical language modeling in machine translation. A distributed infrastructure is proposed which we use to train on up to 2 t...
Thorsten Brants, Ashok C. Popat, Peng Xu, Franz Jo...
Automatic speech recognition (ASR) results contain not only ASR errors, but also disfluencies and colloquial expressions that must be corrected to create readable transcripts. We...
Graham Neubig, Yuya Akita, Shinsuke Mori, Tatsuya ...
It is possible to reduce the bulk of phrasetables for Statistical Machine Translation using a technique based on the significance testing of phrase pair co-occurrence in the para...
Howard Johnson, Joel D. Martin, George F. Foster, ...
In this paper we describe recent improvements to components and methods used in our statistical machine translation system for Chinese-English used in the January 2008 GALE evaluat...
Almut Silja Hildebrand, Kay Rottmann, Mohamed Noam...
This paper presents a direct word reordering model with novel syntax-based features for statistical machine translation. Reordering models address the problem of reordering source...