A Hybrid Hierarchical Model for Multi-Document Summarization

Scoring the sentences of source documents against abstract summaries written by humans is an important step in extractive multi-document summarization. In this paper, we formulate extractive summarization as a two-step learning problem: building a generative model for pattern discovery and a regression model for inference. We calculate scores for sentences in document clusters based on their latent characteristics using a hierarchical topic model. Then, using these scores, we train a regression model based on the lexical and structural characteristics of the sentences, and use the model to score sentences of new documents to form a summary. Our system advances the current state of the art, improving ROUGE scores by 7%. Manual quality evaluations indicate that the generated summaries are less redundant and more coherent.
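The two-step structure described in the abstract can be illustrated with a small sketch. This is not the authors' system: the hierarchical topic model of step one is replaced here by a plain word-overlap score against the human summary, and the lexical and structural features (relative position, sentence length, mean term frequency) are hypothetical choices made only for illustration. All function names are assumptions as well.

```python
from collections import Counter

def tokenize(text):
    """Lowercase whitespace tokenizer that strips basic punctuation."""
    stripped = (w.strip(".,;:!?").lower() for w in text.split())
    return [w for w in stripped if w]

def overlap_score(sentence, summary):
    """Stand-in for step 1 (NOT a topic model): fraction of the
    sentence's tokens that also occur in the human summary."""
    tokens = tokenize(sentence)
    vocab = set(tokenize(summary))
    return sum(t in vocab for t in tokens) / len(tokens) if tokens else 0.0

def features(sentence, position, doc_freq):
    """Hypothetical features: bias, relative position, length, mean term frequency."""
    tokens = tokenize(sentence)
    mean_tf = sum(doc_freq[t] for t in tokens) / len(tokens) if tokens else 0.0
    return [1.0, position, float(len(tokens)), mean_tf]

def fit_ridge(X, y, lam=1e-3):
    """Step 2: solve (X^T X + lam*I) w = X^T y by Gauss-Jordan elimination."""
    d = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) + (lam if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(r[i] * t for r, t in zip(X, y)) for i in range(d)]
    for col in range(d):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, d), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(d):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * c for a, c in zip(A[r], A[col])]
                b[r] -= f * b[col]
    return [b[i] / A[i][i] for i in range(d)]

def featurize(sentences):
    """Build one feature vector per sentence of a document cluster."""
    doc_freq = Counter(t for s in sentences for t in tokenize(s))
    n = len(sentences)
    return [features(s, i / n, doc_freq) for i, s in enumerate(sentences)]

def train(sentences, summary):
    """Regress the step-1 scores on the sentence features."""
    y = [overlap_score(s, summary) for s in sentences]
    return fit_ridge(featurize(sentences), y)

def summarize(sentences, w, k=1):
    """Score unseen sentences with the regression weights and return
    the k highest-scoring ones in their original document order."""
    scores = [sum(wi * xi for wi, xi in zip(w, x)) for x in featurize(sentences)]
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

In use, `train` would be called on sentences from clusters that have human summaries, and `summarize` on the sentences of a new cluster that has none. The small ridge term `lam` is a design choice of this sketch that keeps the linear system solvable when features are correlated.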

Asli Çelikyilmaz, Dilek Hakkani-Tur

ACL 2010 | Computational Linguistics | Extractive Multi-document Summarization | Model | Regression Model

Added	10 Feb 2011
Updated	10 Feb 2011
Type	Journal
Year	2010
Where	ACL
Authors	Asli Çelikyilmaz, Dilek Hakkani-Tur
