This paper presents a new approach for automatic document categorization. Exploiting the logical structure of the document, our approach assigns a HTML document to one or more cate...
It is difficult to identify sentence importance from a single point of view. In this paper, we propose a learning-based approach to combine various sentence features. They are cat...
The work described here builds on [1], where we presented a categorisation of norms or provisions in legislation. We claimed that the categories are characterized by the use of ty...
Since sentences are the basic propositional units of text, knowing their themes should help various tasks requiring the knowledge about the semantic content of text. In this paper,...
With the widespread use of full-text information retrieval, passage-retrieval techniques are becoming increasingly popular. Larger texts can then be replaced by important text exc...
Gerard Salton, Amit Singhal, Chris Buckley, Mandar...