Sciweavers

328 search results - page 13 / 66
» A Multi-level Approach for Document Clustering
ECIR
2008
Springer
13 years 9 months ago
Clustering Template Based Web Documents
More and more documents on the World Wide Web are based on templates. On a technical level this causes those documents to have a quite similar source code and DOM tree structure. G...
Thomas Gottron
KDD
2009
ACM
14 years 8 months ago
Exploiting Wikipedia as external knowledge for document clustering
In traditional text clustering methods, documents are represented as "bags of words" without considering the semantic information of each document. For instance, if two ...
Xiaohua Hu, Xiaodan Zhang, Caimei Lu, E. K. Park, ...
CSDA
2006
13 years 7 months ago
Two-way Poisson mixture models for simultaneous document classification and word clustering
An approach to simultaneous document classification and word clustering is developed using a two-way mixture model of Poisson distributions. Each document is represented by a vect...
Jia Li, Hongyuan Zha
SIGIR
2002
ACM
13 years 7 months ago
Unsupervised document classification using sequential information maximization
We present a novel sequential clustering algorithm which is motivated by the Information Bottleneck (IB) method. In contrast to the agglomerative IB algorithm, the new sequential ...
Noam Slonim, Nir Friedman, Naftali Tishby
TKDE
2011
13 years 2 months ago
Locally Consistent Concept Factorization for Document Clustering
Previous studies have demonstrated that document clustering performance can be improved significantly in lower dimensional linear subspaces. Recently, matrix factorization base...
Deng Cai, Xiaofei He, Jiawei Han