
TSD
2001
Springer

Text Segmentation into Paragraphs Based on Local Text Cohesion

The problem of automatic text segmentation divides into two distinct problems: thematic segmentation into rather large, topically self-contained sections, and splitting into paragraphs, i.e., lower-level lexico-grammatical segmentation. In this paper we consider the latter problem. We propose a method of reasonably splitting text into paragraphs based on a text cohesion measure. Specifically, we propose a method for quantitatively evaluating text cohesion using a large linguistic resource, a collocation network. At each step, our algorithm compares word occurrences in the text against a large database of collocations and semantic links between words in the given natural language. The procedure consists of evaluating the cohesion function, smoothing and normalizing it, and comparing it with a specially constructed threshold.
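The four stages named in the abstract (evaluate, smooth, normalize, threshold) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not give the cohesion function, the smoothing kernel, the normalization, or how the threshold is constructed, so everything here is an assumption made for illustration: the toy `LINKS` set stands in for the collocation database, cohesion is counted as linked or repeated word pairs across a sentence boundary, smoothing uses a triangular kernel, normalization divides by the mean, and the threshold is a fixed constant.

```python
from typing import FrozenSet, List, Set

Link = FrozenSet[str]  # an unordered pair of words assumed to be linked


def boundary_cohesion(sentences: List[List[str]], links: Set[Link]) -> List[float]:
    """Score each boundary between adjacent sentences (assumed measure):
    count word pairs across the boundary that are identical or linked."""
    scores = []
    for left, right in zip(sentences, sentences[1:]):
        count = sum(
            1
            for a in set(left)
            for b in set(right)
            if a == b or frozenset((a, b)) in links
        )
        scores.append(float(count))
    return scores


def smooth(values: List[float], radius: int = 1) -> List[float]:
    """Triangular-kernel smoothing (an assumed choice of kernel)."""
    out = []
    for i in range(len(values)):
        total, weight_sum = 0.0, 0.0
        for d in range(-radius, radius + 1):
            j = i + d
            if 0 <= j < len(values):
                w = radius + 1 - abs(d)
                total += w * values[j]
                weight_sum += w
        out.append(total / weight_sum)
    return out


def normalize(values: List[float]) -> List[float]:
    """Divide by the mean so scores are comparable across texts."""
    mean = sum(values) / len(values) if values else 0.0
    return [v / mean for v in values] if mean > 0 else list(values)


def paragraph_breaks(
    sentences: List[List[str]],
    links: Set[Link],
    threshold: float = 0.75,  # fixed constant, purely illustrative
    radius: int = 1,
) -> List[int]:
    """Return indices of sentences that would start a new paragraph."""
    scores = normalize(smooth(boundary_cohesion(sentences, links), radius))
    return [i + 1 for i, s in enumerate(scores) if s < threshold]


# Hypothetical toy data: two topics of three tokenized sentences each.
LINKS: Set[Link] = {
    frozenset(p)
    for p in [
        ("heavy", "rain"), ("rain", "fell"), ("rain", "stopped"),
        ("stock", "market"), ("market", "price"), ("price", "rose"),
    ]
}
SENTENCES = [
    ["heavy", "rain"], ["rain", "fell"], ["rain", "stopped"],
    ["stock", "market"], ["market", "price"], ["price", "rose"],
]

if __name__ == "__main__":
    print(paragraph_breaks(SENTENCES, LINKS))  # [3]
```

On this toy input the raw boundary scores drop to zero between the third and fourth sentences, and the dip survives smoothing and normalization, so a single break is proposed there. A real system would need a large collocation resource and a threshold tuned to the text, as the abstract indicates.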
Igor A. Bolshakov, Alexander F. Gelbukh
Added 30 Jul 2010
Updated 30 Jul 2010
Type Conference
Year 2001
Where TSD