
DMIN
2006


Towards Using Fewer Features for Text Classification



Abstract: Text classification, or categorization, is a conventional classification problem applied to the text domain. When statistical classification methods are used, an important research issue is the selection of features from the training texts, each of which is treated as a feature vector. In this paper, we propose an approach to feature selection in text classification tasks based on exploiting external information that summarizes the text to be classified. In particular, we study the use of citation contexts in the categorization of academic publications using the Naive Bayesian method. A series of experiments has been performed on a corpus of publications in Computer Science, from which we observe that publication citation contexts can serve as a reliable and effective source for feature selection. We also derive some useful hints on reducing the number of features with a negligible effect on accuracy.
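
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no details of their preprocessing, smoothing, or ranking, so everything below is an assumption. The helper names (`tokenize`, `select_features`, `train_naive_bayes`, `classify`), the whitespace tokenizer, the frequency-based ranking, the add-one smoothing, and the toy data are all hypothetical. The sketch restricts the vocabulary to words that occur in citation contexts and then trains a multinomial Naive Bayes classifier on that reduced feature set.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real preprocessing)."""
    return text.lower().split()


def select_features(citation_contexts, max_features=None):
    """Assumed selection rule: keep the most frequent citation-context words."""
    counts = Counter(tok for ctx in citation_contexts for tok in tokenize(ctx))
    return {tok for tok, _ in counts.most_common(max_features)}


def train_naive_bayes(docs, labels, vocab):
    """Multinomial Naive Bayes with add-one smoothing over the selected vocab."""
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    for text, label in zip(docs, labels):
        word_counts[label].update(t for t in tokenize(text) if t in vocab)
    model = {}
    for label, n_docs in class_docs.items():
        total = sum(word_counts[label].values()) + len(vocab)
        log_prior = math.log(n_docs / len(docs))
        log_like = {w: math.log((word_counts[label][w] + 1) / total) for w in vocab}
        model[label] = (log_prior, log_like)
    return model


def classify(model, text, vocab):
    """Return the label with the highest log posterior; unseen words are ignored."""
    tokens = [t for t in tokenize(text) if t in vocab]
    scores = {
        label: log_prior + sum(log_like[t] for t in tokens)
        for label, (log_prior, log_like) in model.items()
    }
    return max(scores, key=scores.get)


# Toy data, invented purely for illustration.
docs = [
    "we train a neural network with gradient descent on images",
    "the query optimizer uses an index to speed up joins on tables",
]
labels = ["ml", "db"]
citation_contexts = [
    "a neural network trained by gradient descent",
    "an index based query optimizer for joins",
]

vocab = select_features(citation_contexts)
model = train_naive_bayes(docs, labels, vocab)
prediction = classify(model, "gradient descent for a deep neural network", vocab)
```

Passing a small `max_features` value to `select_features` shrinks the vocabulary further, which is one simple way to experiment with the trade-off between feature count and accuracy that the abstract mentions.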

Yuan Yuan, Tianyang Gu


Citation Contexts | DMIN 2006 | DMIN 2007 | Feature Selection | Text Classification |


Related Content

» Use Fewer Instances of the Letter i Toward Writing Style Anonymization

» Complex Linguistic Features for Text Classification A Comprehensive Study

» Contentbased audio classification and retrieval using a fuzzy logic system towards multime...

» Feature Selection Tomography Illustrating that Optimal Feature Filtering is Hopelessly Un...

» Classification of ProteinProtein Interaction FullText Documents Using Text and Citation Ne...

» Text classification using multiword features

» Towards Intelligent Mission Profiles of Micro Air Vehicles Multiscale Viterbi Classificati...

» Using feature construction to avoid large feature spaces in text classification

» Systematic feature evaluation for gene name recognition

Post Info

Added	30 Oct 2010
Updated	30 Oct 2010
Type	Conference
Year	2006
Where	DMIN
Authors	Yuan Yuan, Tianyang Gu
