We present an approach to document clustering based on winnowing fingerprints that achieved good values of effectiveness with considerable save in memory space and computation tim...
Cross Language Information Retrieval community has brought up search engines over multilingual corpora, and multilingual text categorization systems. In this paper, we focus on th...
A great challenge for web site designers is how to ensure users' easy access to important web pages efficiently. In this paper we present a clustering-based approach to addres...
Zhong Su, Qiang Yang, HongJiang Zhang, Xiaowei Xu,...
In this paper, we examine how to improve the precision and recall of document clustering by utilizing meta-data. We use meta-data through NewsML tags to assist clustering and show...
In this paper we propose a probabilistic model for online document clustering. We use non-parametric Dirichlet process prior to model the growing number of clusters, and use a pri...