Sciweavers

CIKM
2009
Springer

Text segmentation via topic modeling: an analytical study

In this paper, the task of text segmentation is approached from a topic modeling perspective. We investigate the use of the latent Dirichlet allocation (LDA) topic model to segment a text into semantically coherent segments. A major benefit of the proposed approach is that, along with the segment boundaries, it outputs the topic distribution associated with each segment. This information is of potential use in applications like segment retrieval and discourse analysis. The new approach outperforms a standard baseline method and yields significantly better performance than most of the available unsupervised methods on a benchmark dataset.
Categories and Subject Descriptors: I.5.4 [Pattern Recognition]: Applications - text processing
General Terms: Algorithms, Experimentation, Performance
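The abstract does not give the algorithm itself, so the following is only a rough sketch of how topic-based segmentation can work in general, not the authors' method. It assumes the topic-word distributions are already available (here a hand-written toy dictionary, `TOPICS`, stands in for a trained LDA model), estimates maximum-likelihood topic proportions per candidate segment instead of running full LDA inference, and uses dynamic programming with a hypothetical per-segment `penalty` to choose boundaries. All function names and parameter values are illustrative assumptions.

```python
import math

# Toy stand-in for topic-word distributions from a trained LDA model (assumption).
TOPICS = [
    {"goal": 0.5, "team": 0.5},      # a "sports" topic
    {"stock": 0.5, "market": 0.5},   # a "finance" topic
]
UNSEEN = 1e-6  # small probability for words a topic does not cover (assumption)


def fit_mixture(words, topics, iters=20):
    """EM estimate of topic proportions for a bag of words, topics held fixed."""
    k = len(topics)
    theta = [1.0 / k] * k
    for _ in range(iters):
        counts = [0.0] * k
        for w in words:
            post = [theta[t] * topics[t].get(w, UNSEEN) for t in range(k)]
            z = sum(post)
            for t in range(k):
                counts[t] += post[t] / z
        theta = [c / len(words) for c in counts]
    return theta


def log_likelihood(words, topics, theta):
    """Log-likelihood of the words under the topic mixture theta."""
    return sum(
        math.log(sum(theta[t] * topics[t].get(w, UNSEEN) for t in range(len(topics))))
        for w in words
    )


def segment(sentences, topics, penalty=2.0):
    """Pick boundaries maximizing summed segment log-likelihood minus a
    per-segment penalty. Sentences are non-empty lists of tokens.
    Returns a list of (start, end, topic_proportions) tuples."""
    n = len(sentences)
    best = [0.0] + [-math.inf] * n   # best[i]: best score for sentences[:i]
    back = [0] * (n + 1)             # back[i]: start of the last segment ending at i
    for end in range(1, n + 1):
        for start in range(end):
            words = [w for s in sentences[start:end] for w in s]
            theta = fit_mixture(words, topics)
            score = best[start] + log_likelihood(words, topics, theta) - penalty
            if score > best[end]:
                best[end], back[end] = score, start
    segments, end = [], n
    while end > 0:                   # walk the back-pointers to recover segments
        start = back[end]
        words = [w for s in sentences[start:end] for w in s]
        segments.append((start, end, fit_mixture(words, topics)))
        end = start
    return segments[::-1]


SENTENCES = [
    ["goal", "team"], ["team", "goal"],
    ["stock", "market"], ["market", "stock"],
]
SEGMENTS = segment(SENTENCES, TOPICS)
for start, end, theta in SEGMENTS:
    print(start, end, [round(p, 3) for p in theta])
```

Each returned tuple carries both the segment span and its estimated topic proportions, which mirrors the benefit the abstract highlights. Without the penalty, splitting never lowers the maximized likelihood, so some form of regularization is needed; the value 2.0 is arbitrary.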
Hemant Misra, François Yvon, Joemon M. Jose, Olivier Cappé
Added 26 May 2010
Updated 26 May 2010
Type Conference
Year 2009
Where CIKM
Authors Hemant Misra, François Yvon, Joemon M. Jose, Olivier Cappé