Structured link vector model (SLVM) is a recently proposed document representation that takes into account both structural and semantic information for measuring XML document simi...
Cluster label quality is crucial for browsing topic hierarchies obtained via document clustering. Intuitively, the hierarchical structure should influence the labeling accuracy. H...
We study dimensionality reduction or feature selection in text document categorization problem. We focus on the first step in building text categorization systems, that is the cho...
In a higher level task such as clustering of web results or word sense disambiguation, knowledge of all possible distinct concepts in which an ambiguous word can be expressed woul...
Multiple Instance Learning (MIL) is a special kind of supervised learning problem that has been studied actively in recent years. We propose an approach based on One-Class Support ...