Sciweavers

COLING
2002

Hierarchical Orderings of Textual Units

14 years 8 days ago
Hierarchical Orderings of Textual Units
Text representation is a central task for any approach to automatic learning from texts. It requires a format which allows to interrelate texts even if they do not share content words, but deal with similar topics. Furthermore, measuring text similarities raises the question of how to organize the resulting clusters. This paper presents cohesion trees (CT) as a data structure for the perspective, hierarchical organization of text corpora. CTs operate on alternative text representation models taking lexical organization, quantitative text characteristics, and text structure into account. It is shown that CTs realize text linkages which are lexically more homogeneous than those produced by minimal spanning trees.
Alexander Mehler
Added 17 Dec 2010
Updated 17 Dec 2010
Type Journal
Year 2002
Where COLING
Authors Alexander Mehler
Comments (0)