Sciweavers

SIGIR
2009
ACM
A comparison of retrieval-based hierarchical clustering approaches to person name disambiguation
This paper describes a simple clustering approach to person name disambiguation of retrieved documents. The methods are based on standard IR concepts and do not require any task-s...
Christof Monz, Wouter Weerkamp
KDD
2009
ACM
Exploiting Wikipedia as external knowledge for document clustering
In traditional text clustering methods, documents are represented as "bags of words" without considering the semantic information of each document. For instance, if two ...
Xiaohua Hu, Xiaodan Zhang, Caimei Lu, E. K. Park, ...
SIGIR
2009
ACM
A latent topic model for linked documents
Documents in many corpora, such as digital libraries and webpages, contain both content and link information. To explicitly consider the document relations represented by links, i...
Zhen Guo, Shenghuo Zhu, Yun Chi, Zhongfei Zhang, Y...
SIGIR
2006
ACM
Near-duplicate detection by instance-level constrained clustering
For the task of near-duplicated document detection, both traditional fingerprinting techniques used in database community and bag-of-word comparison approaches used in information...
Hui Yang, James P. Callan
ICPR
2010
IEEE
NAVIDOMASS: Structural-based Approaches Towards Handling Historical Documents
In the context of the NAVIDOMASS project, the problematic of this paper concerns the clustering of historical document images. We propose a structural-based framework to handle the...
Salim Jouili, Mickaël Coustaty, Salvatore Tab...