This paper shows that it is very often possible to identify the source language of medium-length speeches in the EUROPARL corpus on the basis of frequency counts of word n-grams (...
Abstract. We describe OpenMaTrEx, a free/open-source examplebased machine translation (EBMT) system based on the marker hypothesis, comprising a marker-driven chunker, a collection...
Sandipan Dandapat, Mikel L. Forcada, Declan Groves...
We compare two pivot strategies for phrase-based statistical machine translation (SMT), namely phrase translation and sentence translation. The phrase translation strategy means t...
Statistical machine translation systems are usually trained on large amounts of bilingual text (used to learn a translation model), and also large amounts of monolingual text in th...
We present a new phrase-based conditional exponential family translation model for statistical machine translation. The model operates on a feature representation in which sentenc...