Recent research shows that ontology as background knowledge can improve document clustering quality with its concept hierarchy knowledge. Previous studies take term semantic simila...
Xiaodan Zhang, Liping Jing, Xiaohua Hu, Michael K....
In this research, a systematic study is conducted of four dimension reduction techniques for the text clustering problem, using five benchmark data sets. Of the four methods -- Ind...
Bin Tang, Michael A. Shepherd, Malcolm I. Heywood,...
In this paper, we examine how to improve the precision and recall of document clustering by utilizing meta-data. We use meta-data through NewsML tags to assist clustering and show...
In this paper we propose a probabilistic model for online document clustering. We use non-parametric Dirichlet process prior to model the growing number of clusters, and use a pri...
In this paper, we argue that the agglomerative clustering with vector cosine similarity measure performs poorly due to two reasons. First, the nearest neighbors of a document belo...