Evaluating Evaluation Methods for Generation in the Presence of Variation

Recent years have seen increasing interest in automatic metrics for the evaluation of generation systems. When a system can generate syntactic variation, automatic evaluation becomes more difficult. In this paper, we compare the performance of several automatic evaluation metrics using a corpus of automatically generated paraphrases. We show that these evaluation metrics can at least partially measure adequacy (similarity in meaning), but are not good measures of fluency (syntactic correctness). We make several proposals for improving the evaluation of generation systems that produce variation.
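As a rough illustration of the adequacy/fluency gap described in the abstract, the sketch below scores candidate sentences against a reference using a simple unigram-overlap F1. This is an assumption made for illustration only: the abstract does not name the metrics the paper compares, and the function `unigram_f1` and the example sentences are hypothetical, not taken from the paper or its corpus.

```python
from collections import Counter


def unigram_f1(candidate: str, reference: str) -> float:
    """F1 over word counts shared by candidate and reference (word order is ignored)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Multiset intersection: each word counts as often as it appears in both.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical sentences, chosen only to show the effect of reordering.
reference = "the flight leaves boston at noon"
paraphrase = "at noon the flight leaves boston"   # grammatical reordering
scrambled = "noon the at boston leaves flight"    # same words, ungrammatical
unrelated = "please book a hotel in denver"       # different meaning

scores = {
    "paraphrase": unigram_f1(paraphrase, reference),
    "scrambled": unigram_f1(scrambled, reference),
    "unrelated": unigram_f1(unrelated, reference),
}
print(scores)
```

Under this toy metric the grammatical paraphrase and the scrambled word salad receive the same score, while the unrelated sentence scores zero. A purely overlap-based score of this kind can therefore separate similar from dissimilar content to some degree, but it says nothing about syntactic correctness.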

Amanda Stent, Matthew Marge, Mohit Singhai

Automatic Evaluation | CICLING 2005 | Evaluation | Evaluation Metrics | Natural Language Processing

Post Info

Added	13 Oct 2010
Updated	13 Oct 2010
Type	Conference
Year	2005
Where	CICLING
Authors	Amanda Stent, Matthew Marge, Mohit Singhai
