Automatic Word Clustering for Text Categorization Using Global Information



This paper presents a cluster-based text categorization system which uses class distributional clustering of words. We propose a new clustering model which considers the global information over all the clusters. The model groups words into clusters based on the distribution of class labels associated with each word. Using these learned clusters as features, we develop a cluster-based classifier. We present several experimental results showing that the proposed method performs better than three other text classifiers. The proposed model also gives better results than a model which considers only the information of the two related clusters. In particular, it maintains good performance when the number of features is small and the training corpus is small.

Categories and Subject Descriptors: I.5.3 [PATTERN RECOGNITION]: Clustering—Similarity measures; I.5.4 [PATTERN RECOGNITION]: Application—Text processing; I.5.m [PATTERN RECOGNITION]: Miscellaneous

General Terms: Algorithms...
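As a rough illustration of what class distributional clustering of words means, the sketch below estimates P(class | word) from labeled documents and greedily merges the two clusters whose class distributions are closest under a weighted Jensen-Shannon divergence. This is a generic pairwise baseline written for illustration only, not the global-information model proposed in the paper; the function names (`class_distributions`, `weighted_js`, `cluster_words`) and the choice of divergence are assumptions, not taken from the source.

```python
import math
from collections import Counter, defaultdict


def class_distributions(docs):
    """Estimate P(class | word) from (tokens, label) pairs using raw counts."""
    counts = defaultdict(Counter)
    for tokens, label in docs:
        for tok in tokens:
            counts[tok][label] += 1
    classes = sorted({label for _, label in docs})
    dists, totals = {}, {}
    for word, c in counts.items():
        total = sum(c.values())
        dists[word] = [c[k] / total for k in classes]
        totals[word] = total
    return dists, totals


def kl(p, q):
    """Kullback-Leibler divergence, skipping zero-probability terms of p."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def weighted_js(p, wp, q, wq):
    """Jensen-Shannon divergence of p and q, weighted by occurrence counts."""
    a, b = wp / (wp + wq), wq / (wp + wq)
    m = [a * pi + b * qi for pi, qi in zip(p, q)]
    return a * kl(p, m) + b * kl(q, m)


def cluster_words(docs, n_clusters):
    """Greedy pairwise merging (assumed baseline, not the paper's model)."""
    dists, totals = class_distributions(docs)
    # Each cluster is (words, class distribution, total occurrence count).
    clusters = [([w], dists[w], totals[w]) for w in sorted(dists)]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = weighted_js(clusters[i][1], clusters[i][2],
                                clusters[j][1], clusters[j][2])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        wi, pi, ni = clusters[i]
        wj, pj, nj = clusters[j]
        n = ni + nj
        # The merged distribution is the count-weighted average of the two.
        merged = (wi + wj, [(ni * x + nj * y) / n for x, y in zip(pi, pj)], n)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return [sorted(c[0]) for c in clusters]
```

Calling `cluster_words(docs, 2)` on a small list of `(tokens, label)` pairs returns two word lists; each list could then be treated as a single feature for a classifier. Note that every merge decision here looks only at the two candidate clusters, which is the kind of local criterion the abstract contrasts with its global model.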

Wenliang Chen, Xingzhi Chang, Huizhen Wang, Jingbo Zhu, Tianshun Yao


AIRS 2004 | Class Distributional Clustering | Information Retrieval | Pattern Recognition | Text Categorization |


» The Role of Word Sense Disambiguation in Automated Text Categorization

» Categorical Proportional Difference A Feature Selection Method for Text Categorization

» Text categorization based on the ratio of word frequency in each categories

» Text categorization by boosting automatically extracted concepts

» WordNet and Automated Text Summarization

» Overcoming the Brittleness Bottleneck using Wikipedia Enhancing Text Categorization with E...

» Analyzing the Temporal Sequences for Text Categorization

» Revealing Relations between Open and Closed Answers in Questionnaires through Text Cluster...

Post Info

Added	30 Jun 2010
Updated	30 Jun 2010
Type	Conference
Year	2004
Where	AIRS
Authors	Wenliang Chen, Xingzhi Chang, Huizhen Wang, Jingbo Zhu, Tianshun Yao
