Sciweavers

ACL
2012

PORT: a Precision-Order-Recall MT Evaluation Metric for Tuning

12 years 1 months ago
PORT: a Precision-Order-Recall MT Evaluation Metric for Tuning
Many machine translation (MT) evaluation metrics have been shown to correlate better with human judgment than BLEU. In principle, tuning on these metrics should yield better systems than tuning on BLEU. However, due to issues such as speed, requirements for linguistic resources, and optimization difficulty, they have not been widely adopted for tuning. This paper presents PORT1 , a new MT evaluation metric which combines precision, recall and an ordering metric and which is primarily designed for tuning MT systems. PORT does not require external resources and is quick to compute. It has a better correlation with human judgment than BLEU. We compare PORT-tuned MT systems to BLEU-tuned baselines in five experimental conditions involving four language pairs. PORT tuning achieves consistently better performance than BLEU tuning, according to four automated metrics (including BLEU) and to human evaluation: in comparisons of outputs from 300 source sentences, human judges preferred the PORT...
Boxing Chen, Roland Kuhn, Samuel Larkin
Added 29 Sep 2012
Updated 29 Sep 2012
Type Journal
Year 2012
Where ACL
Authors Boxing Chen, Roland Kuhn, Samuel Larkin
Comments (0)