In traditional text clustering methods, documents are represented as "bags of words" without considering the semantic information of each document. For instance, if two ...
Xiaohua Hu, Xiaodan Zhang, Caimei Lu, E. K. Park, ...
Identifying topics and concepts associated with a set of documents is a task common to many applications. It can help in the annotation and categorization of documents and be used...
In recent years several models have been proposed for text categorization. Within this, one of the widely applied models is the vector space model (VSM), where independence betwee...
Using a ground truth extracted from the Wikipedia, and a ground truth created through manual assessment, we show that the apparent performance advantage seen in machine learning a...