Production of parallel training corpora for the development of statistical machine translation (SMT) systems for resource-poor languages usually requires extensive manual effort. ...
The distortion cost function used in Moses-style machine translation systems has two flaws. First, it does not estimate the future cost of known required moves, thus increasing sea...
Spence Green, Michel Galley, Christopher D. Mannin...
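For orientation, the distortion cost that this abstract criticises can be sketched as below, on the assumption that it means the standard linear distortion penalty of phrase-based decoders: each phrase pays the distance between its first source position and the position right after the previously translated phrase. The function name `linear_distortion` and the span representation are illustrative choices, not taken from the paper or from the Moses code base.

```python
def linear_distortion(phrase_spans):
    """Sum of linear distortion costs for a sequence of source spans.

    phrase_spans lists (start, end) source positions, inclusive and
    0-indexed, in the order the phrases are translated. This is an
    assumed, simplified form of the usual phrase-based penalty.
    """
    cost = 0
    prev_end = -1  # position before the first source word
    for start, end in phrase_spans:
        # A jump costs the distance from where a monotone step would land.
        cost += abs(start - prev_end - 1)
        prev_end = end
    return cost


# Monotone order: every phrase starts right after the previous one.
monotone = linear_distortion([(0, 1), (2, 2), (3, 5)])

# Translating the last phrase first forces a jump forward and a jump back.
reordered = linear_distortion([(3, 5), (0, 1), (2, 2)])
```

Note that the cost is charged only when a jump is actually made. Skipping ahead to positions 3 to 5 costs 3 at that step, and the cost of the jump back that must follow is not counted until it happens, which is the kind of unestimated future cost the abstract describes.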
Certain distinctions made in the lexicon of one language may be redundant when translating into another language. We quantify redundancy among source types by the similarity of th...
Traditional 1-best translation pipelines suffer from a major drawback: the errors of 1-best outputs, inevitably introduced by each module, will propagate and accumulate along the pipeli...
A Bloom filter (BF) is a randomised data structure for set membership queries. Its space requirements are significantly below lossless information-theoretic lower bounds but it ...
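The set membership queries mentioned above can be illustrated with a minimal Bloom filter sketch. The class name, the parameters `num_bits` and `num_hashes`, and the use of seeded SHA-256 as the hash family are assumptions made for illustration only; they are not taken from the paper.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter over strings (illustrative sketch)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One bit per slot, packed into bytes.
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by prefixing a seed to the item.
        data = item.encode("utf-8")
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(seed.to_bytes(4, "big") + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True only if every position is set; may be a false positive.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


ngrams = ["the cat", "cat sat", "sat on"]
bf = BloomFilter(num_bits=1024, num_hashes=3)
for ngram in ngrams:
    bf.add(ngram)
```

A query for an item that was added always returns true, while a query for an item that was never added can occasionally return true as well. The filter stores only bits, not the items themselves, which is where the space saving comes from.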