Language resource quality is crucial in NLP. Many of the resources used are derived from data created by human beings outside an NLP context, especially regarding MT and reference ...
We examine correlations between native speaker judgements on automatically generated German text and automatic evaluation metrics. We look at a number of metrics from the MT a...
We describe a dataset containing 16,000 translations produced by four machine translation systems and manually annotated for quality by professional translators. This dataset can ...
Automatic evaluation of Machine Translation (MT) quality is essential to developing high-quality MT systems. Various evaluation metrics have been proposed, and BLEU is now used as ...
Hideki Isozaki, Tsutomu Hirao, Kevin Duh, Katsuhit...
We illustrate and explain problems of n-gram-based machine translation (MT) metrics (e.g. BLEU) when applied to morphologically rich languages such as Czech. A novel metric SemPO...