When translating among languages that differ substantially in word order, machine translation (MT) systems benefit from syntactic preordering—an approach that uses features fro...
We present a general methodology for extracting multi-word expressions (of various types), along with their translations, from small parallel corpora. We automatically align the p...
Previous work has shown that high quality phrasal paraphrases can be extracted from bilingual parallel corpora. However, it is not clear whether bitexts are an appropriate resourc...
Juri Ganitkevitch, Chris Callison-Burch, Courtney ...
Parallel text is one of the most valuable resources for development of statistical machine translation systems and other NLP applications. The Linguistic Data Consortium (LDC) has...
Aligned corpora are often-used resources in CLIR systems. The three qualities of translation corpora that most dramatically affect the performance of a corpus-based CLIR system are...