Summarization of Multi-Document Topic Hierarchies using Submodular Mixtures

We study the problem of summarizing DAG-structured topic hierarchies over a given set of documents. Example applications include automatically generating Wikipedia disambiguation pages for a set of articles, and generating candidate multi-labels for preparing machine learning datasets (e.g., for text classification, functional genomics, and image classification). Unlike previous work, which focuses on clustering the set of documents using the topic hierarchy as features, we directly pose the problem as a submodular optimization problem on a topic hierarchy using the documents as features. Desirable properties of the chosen topics include document coverage, specificity, topic diversity, and topic homogeneity, each of which, we show, is naturally modeled by a submodular function. Other information, provided say by unsupervised approaches such as LDA and its variants, can also be utilized by defining a submodular function that expresses coherence between the chosen topics and this in...
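
To make the idea of a submodular mixture concrete, here is a minimal sketch and not the authors' implementation. It scores a set of topics with a weighted sum of two monotone submodular functions (document coverage and a square-root diversity reward) and picks topics greedily under a cardinality budget. The topic names, the document IDs, the grouping of topics, the hand-picked weights, and all function names are illustrative assumptions; the abstract does not specify the exact component functions or how the weights are obtained.

```python
from math import sqrt

def coverage(selected, topic_docs):
    """Number of distinct documents covered by the selected topics."""
    covered = set()
    for topic in selected:
        covered |= topic_docs[topic]
    return len(covered)

def diversity(selected, topic_group):
    """Sum over groups of sqrt(#selected topics in that group).

    The concave sqrt gives diminishing returns for picking many
    topics from the same group, which rewards spreading out.
    """
    counts = {}
    for topic in selected:
        group = topic_group[topic]
        counts[group] = counts.get(group, 0) + 1
    return sum(sqrt(c) for c in counts.values())

def mixture(selected, topic_docs, topic_group, weights):
    """Non-negative weighted sum of the two components (still submodular)."""
    w_cov, w_div = weights
    return (w_cov * coverage(selected, topic_docs)
            + w_div * diversity(selected, topic_group))

def greedy_select(topic_docs, topic_group, weights, budget):
    """Repeatedly add the topic with the largest marginal gain."""
    selected = []
    remaining = sorted(topic_docs)  # sorted for deterministic tie-breaking
    current = 0.0
    while remaining and len(selected) < budget:
        best, best_gain = None, 0.0
        for topic in remaining:
            value = mixture(selected + [topic], topic_docs, topic_group, weights)
            gain = value - current
            if gain > best_gain:
                best, best_gain = topic, gain
        if best is None:  # no topic improves the objective
            break
        selected.append(best)
        remaining.remove(best)
        current += best_gain
    return selected

# Hypothetical toy data: topic -> set of document IDs it covers.
topic_docs = {
    "Apple Inc.": {1, 2, 3},
    "Apple (fruit)": {4, 5},
    "Technology companies": {1, 2, 3, 6},
    "Fruit": {4, 5, 7},
}
# Hypothetical coarse grouping used only by the diversity term.
topic_group = {
    "Apple Inc.": "tech",
    "Technology companies": "tech",
    "Apple (fruit)": "food",
    "Fruit": "food",
}

summary = greedy_select(topic_docs, topic_group, weights=(1.0, 0.5), budget=2)
print(summary)
```

Because a non-negative weighted sum of monotone submodular functions is itself monotone submodular, the greedy procedure above carries the classical (1 - 1/e) approximation guarantee for cardinality-constrained maximization. On the toy data it selects one topic from each group, since a second "tech" topic would add little coverage or diversity.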

Ramakrishna Bairi, Rishabh K. Iyer, Ganesh Ramakrishnan, Jeff A. Bilmes

ACL 2015 | Computational Linguistics |

Post Info

Added	13 Apr 2016
Updated	13 Apr 2016
Type	Journal
Year	2015
Where	ACL
Authors	Ramakrishna Bairi, Rishabh K. Iyer, Ganesh Ramakrishnan, Jeff A. Bilmes
