Sciweavers

106 search results - page 11 / 22
SIGIR
2008
ACM
13 years 7 months ago
Enhancing text clustering by leveraging Wikipedia semantics
Most traditional text clustering methods are based on "bag of words" (BOW) representation based on frequency statistics in a set of documents. BOW, however, ignores the ...
Jian Hu, Lujun Fang, Yang Cao, Hua-Jun Zeng, Hua L...
COLING
2008
13 years 9 months ago
A Framework for Identifying Textual Redundancy
The task of identifying redundant information in documents that are generated from multiple sources provides a significant challenge for summarization and QA systems. Traditional ...
Kapil Thadani, Kathleen McKeown
NIPS
2008
13 years 9 months ago
DiscLDA: Discriminative Learning for Dimensionality Reduction and Classification
Probabilistic topic models have become popular as methods for dimensionality reduction in collections of text documents or images. These models are usually treated as generative m...
Simon Lacoste-Julien, Fei Sha, Michael I. Jordan
RIAO
2004
13 years 9 months ago
Multilingual document clusters discovery
Cross Language Information Retrieval community has brought up search engines over multilingual corpora, and multilingual text categorization systems. In this paper, we focus on th...
Benoît Mathieu, Romaric Besançon, Chr...
DAS
2004
Springer
14 years 1 month ago
Unity Is Strength: Coupling Media for Thematic Segmentation
This paper presents the evaluation methods and the preliminary results of a combined thematic segmentation of (a) meeting documents and (b) meeting speech transcript. Our ...
Dalila Mekhaldi, Denis Lalanne, Rolf Ingold