Exploring Correlation Between ROUGE and Human Evaluation on Meeting Summaries

Abstract: Automatic summarization evaluation is very important to the development of summarization systems. In text summarization, ROUGE has been shown to correlate well with human evaluation when measuring the match of content units. However, the multiparty meeting domain has many characteristics that may pose problems for ROUGE. The goal of this paper is to examine how well ROUGE scores correlate with human evaluation for extractive meeting summarization, and to explore the meeting-domain-specific factors that affect the correlation. This study extends the analysis in our previous work [1]. Our experiments show that the correlation between ROUGE and human evaluation is generally not strong; however, when several unique meeting characteristics, such as disfluencies, speaker information, and stopwords, are accounted for in the ROUGE setting, better correlation can be achieved, especially on the system summaries. We also found th...
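As a rough illustration of the kind of analysis the abstract describes, the sketch below computes a simplified ROUGE-1 recall (clipped unigram overlap) with an optional word filter, then a rank correlation between two score lists. It is not the authors' setup or the official ROUGE toolkit: the word lists, the function names, and the choice of Spearman's rho are all assumptions made for this example, since the abstract does not name the correlation coefficient.

```python
from collections import Counter

# Hypothetical, tiny word lists chosen only for illustration.
STOPWORDS = frozenset({"the", "a", "to", "we", "is", "of", "and"})
FILLERS = frozenset({"uh", "um"})  # stand-in for meeting disfluencies


def tokens(text, drop=frozenset()):
    """Lowercase, split on whitespace, and drop any word in `drop`."""
    return [w for w in text.lower().split() if w not in drop]


def rouge1_recall(candidate, reference, drop=frozenset()):
    """Clipped unigram overlap divided by the reference length."""
    cand = Counter(tokens(candidate, drop))
    ref = Counter(tokens(reference, drop))
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(n, cand[w]) for w, n in ref.items())
    return overlap / total


def ranks(values):
    """Average 1-based ranks, so tied values share the same rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))
```

With these helpers, one would score each summary with `rouge1_recall` under different `drop` settings (none, `FILLERS`, `STOPWORDS`) and pass the resulting list, together with the matching human scores, to `spearman` to see which setting tracks the human judgments more closely.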

Feifan Liu, Yang Liu

Keywords: Automatic Summarization Evaluation | Human Evaluation | Summarization Evaluation | TASLP 2010

Post Info

Added	30 Jan 2011
Updated	30 Jan 2011
Type	Journal
Year	2010
Where	TASLP
Authors	Feifan Liu, Yang Liu

Sciweavers
