Background: Topic detection is the task of automatically identifying topics (e.g., "biochemistry" and "protein structure") in scientific articles based on infor...
Several problems in text categorization are too hard to be solved by standard bag-of-words representations. Work in kernel-based learning has approached this problem by (i) consid...
The construction of a text classifier usually involves (i) a phase of term selection, in which the most relevant terms for the classification task are identified, (ii) a phase ...
Many feature selection methods have been explored for text categorization; of these, mutual information, information gain, and chi-square are considered the most effective. ...
Sanasam Ranbir Singh, Hema A. Murthy, Timothy A. G...
This paper examines several different approaches to exploiting structural information in semi-structured document categorization. The methods under consideration are designed for ...