
SIGIR
2003
ACM

Document clustering based on non-negative matrix factorization

In this paper, we propose a novel document clustering method based on the non-negative factorization of the term-document matrix of the given document corpus. In the latent semantic space derived by the non-negative matrix factorization (NMF), each axis captures the base topic of a particular document cluster, and each document is represented as an additive combination of the base topics. The cluster membership of each document can be easily determined by finding the base topic (the axis) with which the document has the largest projection value. Our experimental evaluations show that the proposed document clustering method surpasses the latent semantic indexing and the spectral clustering methods not only in the easy and reliable derivation of document clustering results, but also in document clustering accuracies.

Categories and Subject Descriptors: H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval - Clustering
General Terms: Algorithms
Keywords: Document Clust...
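The assignment rule described in the abstract (factorize the term-document matrix, then give each document to the base topic on which it has the largest projection value) can be illustrated with a small sketch. The abstract does not say how the factorization is computed, so everything below is an assumption made for illustration: the multiplicative update rules in the style of Lee and Seung, the column normalization, the toy matrix, and the helper names `matmul`, `transpose`, `nmf`, and `assign_clusters`. It should not be read as the authors' implementation.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def nmf(x, k, iters=300, seed=0, eps=1e-9):
    """Approximate x (terms x docs) by u * v^T with non-negative u and v.

    Assumed solver: multiplicative updates that reduce the squared error.
    u is terms x k (base topics), v is docs x k (projection values).
    """
    rng = random.Random(seed)
    m, n = len(x), len(x[0])
    u = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    v = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    xt = transpose(x)
    for _ in range(iters):
        # u <- u * (x v) / (u v^T v), element-wise
        num = matmul(x, v)
        den = matmul(u, matmul(transpose(v), v))
        u = [[u[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(m)]
        # v <- v * (x^T u) / (v u^T u), element-wise
        num = matmul(xt, u)
        den = matmul(v, matmul(transpose(u), u))
        v = [[v[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    # Assumption: rescale each topic axis to unit length so that projection
    # values are comparable across topics; u * v^T is left unchanged.
    for j in range(k):
        norm = math.sqrt(sum(u[i][j] ** 2 for i in range(m))) or 1.0
        for i in range(m):
            u[i][j] /= norm
        for i in range(n):
            v[i][j] *= norm
    return u, v


def assign_clusters(v):
    """Label each document with the axis holding its largest projection."""
    return [max(range(len(row)), key=row.__getitem__) for row in v]


# Toy term-document matrix (rows: terms, columns: documents).
# Documents 0 and 1 share one vocabulary; documents 2 and 3 share another.
X = [
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 3, 2],
    [0, 0, 2, 3],
    [0, 0, 1, 1],
]

U, V = nmf(X, k=2)
labels = assign_clusters(V)
print(labels)
```

Which integer labels each cluster depends on the random initialization, so only the grouping of documents is meaningful, not the label values themselves.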
Wei Xu, Xin Liu, Yihong Gong
Added: 05 Jul 2010
Updated: 05 Jul 2010
Type: Conference
Year: 2003
Where: SIGIR