Sciweavers

» Combining Statistics and Semantics for Word and Document Clu...
NLDB
2007
Springer
14 years 1 month ago
Selecting Labels for News Document Clusters
This work deals with determination of meaningful and terse cluster labels for News document clusters. We analyze a number of alternatives for selecting headlines and/or sentences o...
Krishnaprasad Thirunarayan, Trivikram Immaneni, Ma...
IJCAI
2007
13 years 9 months ago
Multi-Document Summarization by Maximizing Informative Content-Words
We show that a simple procedure based on maximizing the number of informative content-words can produce some of the best reported results for multi-document summarization. We fir...
Wen-tau Yih, Joshua Goodman, Lucy Vanderwende, His...
CIS
2005
Springer
14 years 1 month ago
Concept Chain Based Text Clustering
Different from familiar clustering objects, text documents have sparse data spaces. A common way of representing a document is as a bag of its component words, but the semantic re...
Shaoxu Song, Jian Zhang, Chunping Li
SIGIR
1999
ACM
14 years 23 hours ago
Probabilistic Latent Semantic Indexing
Probabilistic Latent Semantic Indexing is a novel approach to automated document indexing which is based on a statistical latent class model for factor analysis of count data. Fit...
Thomas Hofmann
SIGIR
2008
ACM
13 years 7 months ago
Knowledge transformation from word space to document space
In most IR clustering problems, we directly cluster the documents, working in the document space, using cosine similarity between documents as the similarity measure. In many real...
Tao Li, Chris H. Q. Ding, Yi Zhang 0005, Bo Shao
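
Several of the abstracts above mention representing a document as a bag of its component words and comparing documents with cosine similarity. As a rough illustration only, the sketch below shows one common way to compute that measure. The function names `bag_of_words` and `cosine_similarity`, the whitespace tokenization, and the sample sentences are assumptions made for this example and are not taken from any of the listed papers.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split on whitespace, and count each term.

    Assumption: whitespace tokenization stands in for whatever
    preprocessing a real system would use.
    """
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors."""
    # Only terms present in both documents contribute to the dot product.
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical sample documents, invented for illustration.
doc_a = bag_of_words("clustering news documents by topic")
doc_b = bag_of_words("clustering documents with latent topic models")
doc_c = bag_of_words("weather forecast for tomorrow")

print(cosine_similarity(doc_a, doc_b))  # shares some terms with doc_a
print(cosine_similarity(doc_a, doc_c))  # shares no terms with doc_a
```

Because the vectors hold raw term counts, two documents with no words in common score exactly zero even when they discuss the same subject. That is the limitation the abstracts point to when they say the plain bag-of-words view misses semantic relations.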