Unsupervised Learning with Term Clustering for Thematic Segmentation of Texts

In this paper we introduce a machine learning approach for automatic text segmentation. Our text segmenter clusters text segments containing similar concepts. It first discovers the different concepts present in a text, each concept being defined as a set of representative terms. The text is then partitioned into coherent paragraphs using a clustering technique based on the Classification Maximum Likelihood approach. We evaluate the effectiveness of this technique on sets of concatenated paragraphs from two collections, the 7sectors and 20 Newsgroups corpora, and compare it to a baseline text segmentation technique proposed by Salton et al.
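
The clustering step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a multinomial mixture over word counts fitted with the Classification EM (CEM) procedure, a common way to optimise a Classification Maximum Likelihood criterion, and it skips the concept-discovery step by clustering paragraphs directly on raw words. The function name `cem_cluster`, the smoothing parameter `alpha`, and the round-robin initialisation are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def cem_cluster(paragraphs, k, n_iter=20, alpha=1.0):
    """Hard-cluster tokenised paragraphs with a multinomial mixture (CEM sketch).

    paragraphs: list of lists of word strings.
    Returns one cluster label (0..k-1) per paragraph.
    Illustrative only; not the method as specified in the paper.
    """
    vocab = sorted({w for p in paragraphs for w in p})
    counts = [Counter(p) for p in paragraphs]
    n = len(paragraphs)

    # Assumption: simple deterministic round-robin initial partition.
    labels = [i % k for i in range(n)]

    for _ in range(n_iter):
        # M-step: estimate cluster priors and smoothed word probabilities
        # from the current hard partition.
        log_prior, log_prob = [], []
        for c in range(k):
            total = Counter()
            size = 0
            for i in range(n):
                if labels[i] == c:
                    total.update(counts[i])
                    size += 1
            denom = sum(total.values()) + alpha * len(vocab)
            log_prob.append(
                {w: math.log((total[w] + alpha) / denom) for w in vocab}
            )
            log_prior.append(math.log((size + alpha) / (n + alpha * k)))

        # C-step: assign each paragraph to its most likely cluster.
        def score(cnt, c):
            return log_prior[c] + sum(m * log_prob[c][w] for w, m in cnt.items())

        new_labels = [max(range(k), key=lambda c: score(cnt, c)) for cnt in counts]

        if new_labels == labels:  # partition is stable: stop
            break
        labels = new_labels

    return labels


# Toy usage: three "finance" paragraphs followed by three "sport" paragraphs.
toy = [
    ["stock", "market", "bank"],
    ["market", "bank", "stock"],
    ["bank", "stock", "market"],
    ["goal", "team", "match"],
    ["team", "match", "goal"],
    ["match", "goal", "team"],
]
labels = cem_cluster(toy, k=2)
```

On this toy input the two topics should end up in different clusters. In a segmentation setting, boundaries would then be placed wherever consecutive paragraphs receive different labels; that post-processing is likewise an assumption rather than something stated in the abstract.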

Marc Caillet, Jean-François Pessiot, Massih

Automatic Text Segmentation | RIAO 2004 | RIAO 2007 | Text Segmentation | Text Segmentation Technique |

» Multitask text segmentation and alignment based on weighted mutual information

» Object Segmentation by Long Term Analysis of Point Trajectories

» The structure of verbal sequences analyzed with unsupervised learning techniques

» Automatic unsupervised parameter selection for character segmentation

» Acquiring Domain-Specific Dialog Information from Task-Oriented Human-Human Interaction throu...

» Unsupervised Acquiring of Morphological Paradigms from Tokenized Text

» Discovery of numerous specific topics via term cooccurrence analysis

» Identification of class specific discourse patterns

Post Info

Added	31 Oct 2010
Updated	31 Oct 2010
Type	Conference
Year	2004
Where	RIAO
Authors	Marc Caillet, Jean-François Pessiot, Massih-Reza Amini, Patrick Gallinari
