
LREC 2010

Appraise: An Open-Source Toolkit for Manual Phrase-Based Evaluation of Translations

We describe a focused effort to investigate the performance of phrase-based human evaluation of machine translation output, achieving high annotator agreement. We define phrase-based evaluation and describe the implementation of Appraise, a toolkit that supports the manual evaluation of machine translation results. Phrase ranking can be done using either a fine-grained six-way scoring scheme that allows annotators to differentiate between "much better" and "slightly better", or a reduced subset of ranking choices. We then discuss agreement values for both scoring models, obtained from several experiments conducted with human annotators. Our results show that phrase-based evaluation allows fast evaluation while obtaining significant agreement among annotators. The granularity of ranking choices should, however, not be too fine-grained, as this seems to confuse annotators and thus reduces overall agreement. The work reported in this paper confirms previous work in the field and illus...
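As an illustration of the kind of analysis the abstract describes (not the actual Appraise implementation), the following Python sketch collapses a hypothetical fine-grained ranking scale into a reduced three-way scale and compares inter-annotator agreement (Cohen's kappa) under both. The category labels beyond "much better" and "slightly better", the mapping, and the toy annotations are all assumptions made for the example.

from collections import Counter

# Hypothetical fine-grained ranking labels; the abstract only names
# "much better" and "slightly better" explicitly, the rest are assumed.
FINE_TO_COARSE = {
    "much better": "better",
    "slightly better": "better",
    "about equal": "equal",
    "undecided": "equal",
    "slightly worse": "worse",
    "much worse": "worse",
}

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(
        (counts_a[c] / n) * (counts_b[c] / n)
        for c in set(counts_a) | set(counts_b)
    )
    return (observed - expected) / (1 - expected)

# Toy annotations on the fine-grained scale (made-up example data).
annotator_1 = ["much better", "slightly better", "about equal", "much worse"]
annotator_2 = ["slightly better", "much better", "about equal", "slightly worse"]

# Agreement on the fine-grained scale vs. the reduced scale.
fine_kappa = cohen_kappa(annotator_1, annotator_2)
coarse_kappa = cohen_kappa(
    [FINE_TO_COARSE[x] for x in annotator_1],
    [FINE_TO_COARSE[x] for x in annotator_2],
)
print(f"fine-grained kappa: {fine_kappa:.2f}, reduced kappa: {coarse_kappa:.2f}")

On this toy data, the two annotators disagree on fine-grained distinctions but agree once the scale is collapsed, which mirrors the paper's observation that an overly fine-grained scale can reduce overall agreement.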
Type: Conference
Year: 2010
Where: LREC
Authors: Christian Federmann