Quantifying the Limits and Success of Extractive Summarization Systems Across Domains

This paper analyzes the topic identification stage of single-document automatic text summarization across four different domains, consisting of newswire, literary, scientific and legal documents. We present a study that explores the summary space of each domain via an exhaustive search strategy, and finds the probability density function (pdf) of the ROUGE score distributions for each domain. We then use this pdf to calculate the percentile rank of extractive summarization systems. Our results introduce a new way to judge the success of automatic summarization systems and bring quantified explanations to questions such as why it was so hard for the systems to date to have a statistically significant improvement over the lead baseline in the news domain.
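The procedure the abstract describes (enumerate the summary space, score every extract, then locate a system within the resulting score distribution) can be illustrated with a small sketch. This is not the authors' implementation: it assumes a simplified unigram-recall stand-in for ROUGE-1, uses the empirical distribution of scores rather than a fitted pdf, and runs on a toy document. The names `rouge1_recall`, `summary_space_scores` and `percentile_rank`, along with the example sentences and reference, are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def rouge1_recall(candidate, reference):
    """Simplified ROUGE-1 recall (assumption): clipped unigram overlap
    divided by the number of reference tokens. No stemming or stopwords."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum(min(count, cand[token]) for token, count in ref.items())
    return overlap / sum(ref.values())


def summary_space_scores(sentences, reference, k):
    """Exhaustively score every k-sentence extract, keeping document order."""
    return {
        idx: rouge1_recall(" ".join(sentences[i] for i in idx), reference)
        for idx in combinations(range(len(sentences)), k)
    }


def percentile_rank(scores, system_score):
    """Percentage of extracts that score strictly below the system."""
    below = sum(1 for s in scores if s < system_score)
    return 100.0 * below / len(scores)


# Toy data (hypothetical), standing in for one document and its reference.
sentences = [
    "the storm hit the coast on monday",
    "officials ordered an evacuation",
    "thousands lost power overnight",
    "schools will reopen next week",
]
reference = "the storm hit the coast and thousands lost power"

scores = summary_space_scores(sentences, reference, k=2)
lead_score = scores[(0, 1)]          # lead baseline: the first two sentences
best = max(scores, key=scores.get)   # upper bound for 2-sentence extracts
print(best, percentile_rank(scores.values(), lead_score))  # (0, 2) 50.0
```

On this toy input the lead baseline sits at the 50th percentile of the six possible two-sentence extracts. Exhaustive enumeration grows combinatorially with document length, so a sketch like this is only practical for short documents or small summary sizes.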

Hakan Ceylan, Rada Mihalcea, Umut Özertem, Elena Lloret, Manuel Palomar

Automatic Text Summarization | Computational Linguistics | Extractive Summarization Systems | NAACL 2010 | Summarization Systems

Post Info

Added	14 Feb 2011
Updated	14 Feb 2011
Type	Conference
Year	2010
Where	NAACL
Authors	Hakan Ceylan, Rada Mihalcea, Umut Özertem, Elena Lloret, Manuel Palomar

Sciweavers
