The distortion cost function used in Mosesstyle machine translation systems has two flaws. First, it does not estimate the future cost of known required moves, thus increasing sea...
Spence Green, Michel Galley, Christopher D. Mannin...
Abstract. This paper describes a methodology for constructing aligned German-Chinese corpora from movie subtitles. The corpora will be used to train a special machine translation s...
When aligning texts in very different languages such as Korean and English, structural features beyond word or phrase give useful intbrmation. In this paper, we present a method f...
A Bloom filter (BF) is a randomised data structure for set membership queries. Its space requirements are significantly below lossless information-theoretic lower bounds but it ...
Parallel corpora are critical resources for machine translation research and development since parallel corpora contain translation equivalences of various granularities. Manual a...