Sciweavers

COLING
2008
13 years 9 months ago
A Framework for Identifying Textual Redundancy
The task of identifying redundant information in documents that are generated from multiple sources provides a significant challenge for summarization and QA systems. Traditional ...
Kapil Thadani, Kathleen McKeown
SIGIR
2003
ACM
14 years 28 days ago
Document clustering based on non-negative matrix factorization
In this paper, we propose a novel document clustering method based on the non-negative factorization of the term-document matrix of the given document corpus. In the latent semanti...
Wei Xu, Xin Liu, Yihong Gong
CLEF
2011
Springer
12 years 7 months ago
A Language-Independent Approach to Identify the Named Entities in Under-Resourced Languages and Clustering Multilingual Document
Abstract. This paper presents a language-independent Multilingual Document Clustering (MDC) approach on comparable corpora. Named entities (NEs) such as persons, locations, organiza...
N. Kiran Kumar, G. S. K. Santosh, Vasudeva Varma
SIGIR
2008
ACM
13 years 7 months ago
Pagerank based clustering of hypertext document collections
Clustering hypertext document collections is an important task in Information Retrieval. Most clustering methods are based on document content and do not take into account the hype...
Konstantin Avrachenkov, Vladimir Dobrynin, Danil N...
KDD
2005
ACM
14 years 8 months ago
A hybrid unsupervised approach for document clustering
We propose a hybrid, unsupervised document clustering approach that combines a hierarchical clustering algorithm with Expectation Maximization. We developed several heuristics to ...
Mihai Surdeanu, Jordi Turmo, Alicia Ageno