Statistical machine translation (SMT) requires a large parallel corpus, which is available only for restricted language pairs and domains. To expand the language pairs and domains...
Recently system combination has been shown to be an effective way to improve translation quality over single machine translation systems. In this paper, we present a simple and ef...
Enriching a pronunciation dictionary with phonological variation is a challenging task, not yet solved despite several decades of research, in particular for speech-to-text transc...
Mining bilingual data (including bilingual sentences and terms) from the Web can benefit many NLP applications, such as machine translation and cross language information retrie...
Long Jiang, Shiquan Yang, Ming Zhou, Xiaohua Liu, ...
We present cdec, an open source framework for decoding, aligning with, and training a number of statistical machine translation models, including word-based models, phrase-based m...
Chris Dyer, Adam Lopez, Juri Ganitkevitch, Jonatha...