Sciweavers

» Distributional Clustering of Words for Text Classification
PRIS
2004
13 years 8 months ago
Effect of Feature Smoothing Methods in Text Classification Tasks
Abstract. The number of features to be considered in a text classification system is given by the size of the vocabulary and this is normally in the range of the tens or hundreds o...
David Vilar, Hermann Ney, Alfons Juan, Enrique Vid...
ACST
2006
13 years 8 months ago
Distributed hierarchical document clustering
This paper investigates the applicability of a distributed clustering technique called RACHET [1] to organizing large sets of distributed text data. Although the authors of RACHET c...
Debzani Deb, M. Muztaba Fuad, Rafal A. Angryk
ACL
1994
13 years 8 months ago
A Corpus-Based Approach to Automatic Compound Extraction
An automatic compound retrieval method is proposed to extract compounds within a text message. It uses n-gram mutual information, relative frequency count and parts of speech as t...
Keh-Yih Su, Ming-Wen Wu, Jing-Shin Chang
ICML
2006
IEEE
14 years 8 months ago
Clustering documents with an exponential-family approximation of the Dirichlet compound multinomial distribution
The Dirichlet compound multinomial (DCM) distribution, also called the multivariate Polya distribution, is a model for text documents that takes into account burstiness: the fact ...
Charles Elkan
TREC
2007
13 years 8 months ago
WIM at TREC 2007
This paper introduces the four tracks in which the WIM-Lab at Fudan University took part at TREC 2007. For the spam track, a multi-centre model was proposed considering the characteristi...
Jun Xu, Jing Yao, Jiaqian Zheng, Qi Sun, Junyu Niu