Using an Evolving Thematic Clustering in a Text Segmentation Process

Abstract: The thematic text segmentation task consists of identifying the most important thematic breaks in a document in order to cut it into homogeneous passages. In this paper we propose an algorithm for linear text segmentation on general corpora. It relies on an initial clustering of the sentences of the text. This preliminary partitioning provides a global view of the relations between sentences in the text, considering similarities within a group rather than between individual sentences. The method, called ClassStruggle, is based on the distribution of the occurrences of the members of each class. During the process, the clusters evolve according to a notion of proximity and of layout in the text, with the aim of creating groups that contain only sentences related to the same topic development. Finally, boundaries are created between sentences belonging to two different classes. First experimental results are promising: ClassStruggle appears to be very competitive compared with existing ...
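
The final step described in the abstract, placing a boundary wherever two adjacent sentences belong to different classes, can be illustrated with a minimal Python sketch. It is not the authors' ClassStruggle implementation: the clustering and the cluster-evolution steps are not reproduced here, and the function names `boundaries_from_labels` and `split_into_segments`, as well as the integer class labels, are assumptions made purely for illustration.

```python
from typing import List, Sequence


def boundaries_from_labels(labels: Sequence[int]) -> List[int]:
    """Return every index i where sentence i starts a new segment.

    A boundary is placed between sentence i-1 and sentence i whenever
    their (hypothetical) class labels differ.
    """
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]


def split_into_segments(sentences: Sequence[str],
                        labels: Sequence[int]) -> List[List[str]]:
    """Cut the sentence sequence at each class change (linear segmentation)."""
    if len(sentences) != len(labels):
        raise ValueError("one label per sentence is required")
    if not sentences:
        return []
    cuts = [0] + boundaries_from_labels(labels) + [len(sentences)]
    return [list(sentences[a:b]) for a, b in zip(cuts, cuts[1:])]


if __name__ == "__main__":
    # Illustrative labels only; in the paper they would come from the
    # evolved clustering, which this sketch does not compute.
    sents = ["s0", "s1", "s2", "s3", "s4", "s5"]
    labs = [0, 0, 1, 1, 1, 0]
    print(boundaries_from_labels(labs))
    print(split_into_segments(sents, labs))
```

Because a cut is made at every label change, the quality of the segmentation depends entirely on how coherent the class labels are along the text, which is presumably why the abstract has the clusters evolve with proximity and layout before boundaries are drawn.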

Sylvain Lamprier, Tassadit Amghar, Bernard Levrat, Frédéric Saubion

Important Thematic Breaks | JUCS 2008 | Text Segmentation | Text Segmentation Task

» Text Clustering on Latent Thematic Spaces Variants Strengths and Weaknesses

» Thematic Segment Retrieval Revisited

» Evolutionary Basic Notions for a Thematic Representation of General Knowledge

» Using Text Segmentation to Enhance the Cluster Hypothesis

» Adaptive Region Growing Color Segmentation for Text Using Irregular Pyramid

» The Usefulness of Conceptual Representation for the Identification of Semantic Variability...

» Evolving Better Stoplists for Document Clustering and Web Intelligence

» Segmentation and alignment of parallel text for statistical machine translation

Post Info

Added	13 Dec 2010
Updated	13 Dec 2010
Type	Journal
Year	2008
Where	JUCS
Authors	Sylvain Lamprier, Tassadit Amghar, Bernard Levrat, Frédéric Saubion
