We use existing tools to automatically build two parallel treebanks from existing parallel corpora. We then show that combining the data extracted from both the treebanks and the ...
In statistical language modeling, one technique to reduce the problematic effects of data sparsity is to partition the vocabulary into equivalence classes. In this paper we invest...
We propose a structure called dependency forest for statistical machine translation. A dependency forest compactly represents multiple dependency trees. We develop new algorithms ...
Zhaopeng Tu, Yang Liu, Young-Sook Hwang, Qun Liu, ...
We present a new open source toolkit for phrase-based and syntax-based machine translation. The toolkit supports several state-of-the-art models developed in statistical machine t...
We present a joint morphological-lexical language model (JMLLM) for use in statistical machine translation (SMT) of language pairs where one or both of the languages are morpholog...