Parallel data in the domain of interest is the key resource when training a statistical machine translation (SMT) system for a specific purpose. Since ad-hoc manual translation c...
Prasanth Kolachina, Nicola Cancedda, Marc Dymetman...
We present a new open source toolkit for phrase-based and syntax-based machine translation. The toolkit supports several state-of-the-art models developed in statistical machine t...
We present a joint morphological-lexical language model (JMLLM) for use in statistical machine translation (SMT) of language pairs where one or both of the languages are morpholog...
In statistical language modeling, one technique to reduce the problematic effects of data sparsity is to partition the vocabulary into equivalence classes. In this paper we invest...
This paper reports on the benefits of large-scale statistical language modeling in machine translation. A distributed infrastructure is proposed which we use to train on up to 2 t...
Thorsten Brants, Ashok C. Popat, Peng Xu, Franz Jo...