We present an approach to MT between Turkic languages and present results from an implementation of a MT system from Turkmen to Turkish. Our approach relies on ambiguous lexical a...
Statistical machine translation (SMT) models require bilingual corpora for training, and these corpora are often multilingual with parallel text in multiple languages simultaneous...
We use existing tools to automatically build two parallel treebanks from existing parallel corpora. We then show that combining the data extracted from both the treebanks and the ...
This paper extends the training and tuning regime for phrase-based statistical machine translation to obtain fluent translations into morphologically complex languages (we build ...
In statistical language modeling, one technique to reduce the problematic effects of data sparsity is to partition the vocabulary into equivalence classes. In this paper we invest...