Sciweavers

ACL
2009

A Framework of Feature Selection Methods for Text Categorization

In text categorization, feature selection (FS) is a strategy that aims at making text classifiers more efficient and accurate. However, when dealing with a new task, it is still difficult to quickly select a suitable method from the many FS methods proposed in previous studies. In this paper, we propose a theoretical framework of FS methods based on two basic measurements: frequency measurement and ratio measurement. Six popular FS methods are then discussed in detail under this framework. Moreover, guided by our theoretical analysis, we propose a novel method called weighed frequency and odds (WFO) that combines the two measurements with trained weights. Experimental results on data sets from both topic-based and sentiment classification tasks show that this new method is robust across different tasks and numbers of selected features.
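As a rough illustration of combining a frequency measurement with a ratio measurement, the Python sketch below scores each term for a target class. The abstract does not give the formula, so the specific form used here (the class-conditional document frequency raised to a weight `lam`, multiplied by the log of the frequency ratio raised to `1 - lam`, and zero when the ratio does not exceed 1) is an assumption, as are the add-one smoothing and the names `doc_freq`, `combined_scores`, and `top_k`. The weight is also set by hand here, whereas the abstract says the weights are trained.

```python
import math
from collections import Counter


def doc_freq(docs):
    """Count, for each term, how many documents contain it."""
    counts = Counter()
    for doc in docs:
        counts.update(set(doc))
    return counts


def combined_scores(pos_docs, neg_docs, lam=0.5):
    """Score terms for the positive class (assumed form, not confirmed by the listing).

    lam weights the frequency measurement; (1 - lam) weights the ratio measurement.
    """
    df_pos, df_neg = doc_freq(pos_docs), doc_freq(neg_docs)
    n_pos, n_neg = len(pos_docs), len(neg_docs)
    scores = {}
    for term in set(df_pos) | set(df_neg):
        # Add-one smoothed estimates of P(term | class) and P(term | other class).
        p_pos = (df_pos[term] + 1) / (n_pos + 2)
        p_neg = (df_neg[term] + 1) / (n_neg + 2)
        ratio = p_pos / p_neg
        if ratio > 1.0:
            scores[term] = (p_pos ** lam) * (math.log(ratio) ** (1.0 - lam))
        else:
            # Terms that are not more typical of the target class score zero.
            scores[term] = 0.0
    return scores


def top_k(scores, k):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]
```

With `lam` close to 1 the ranking is driven mostly by how often a term occurs in the target class; with `lam` close to 0 it is driven mostly by how much more often the term occurs there than elsewhere.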
Shoushan Li, Rui Xia, Chengqing Zong, Chu-Ren Huang
Added 16 Feb 2011
Updated 16 Feb 2011
Type Journal
Year 2009
Where ACL
Authors Shoushan Li, Rui Xia, Chengqing Zong, Chu-Ren Huang