Sciweavers

CIKM
2010
Springer

Fast dimension reduction for document classification based on imprecise spectrum analysis

This paper proposes an algorithm called Imprecise Spectrum Analysis (ISA) to carry out fast dimension reduction for document classification. ISA is based on the one-sided Jacobi method for Singular Value Decomposition (SVD). To speed up dimension reduction, it simplifies the orthogonalization process of the Jacobi computation and introduces a new mapping formula for transforming the original document-term vectors. To improve classification accuracy using ISA, a feature selection method is further developed to make inter-class feature vectors more orthogonal when building the initial weighted term-document matrix. Our experimental results show that ISA is extremely fast in handling large term-document matrices and delivers better or competitive classification accuracy compared to SVD-based LSI.
Categories and Subject Descriptors: H.3.1 [Information Storage and Retrieval]: Content Analysis and Indexing
General Terms: Algorithms, Experimentation, Performance
Keywords: Feature Selection, LSI, S...
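For readers unfamiliar with the method ISA starts from, the following is a minimal sketch of the classical one-sided (Hestenes) Jacobi SVD in plain Python. It is the textbook procedure, not ISA itself: the abstract does not specify how ISA simplifies the orthogonalization or what its mapping formula is, so neither appears here. The function name `one_sided_jacobi_svd`, the parameters `tol` and `max_sweeps`, and the list-of-rows matrix layout are assumptions made for illustration.

```python
import math


def one_sided_jacobi_svd(a, tol=1e-12, max_sweeps=60):
    """Textbook one-sided Jacobi SVD (illustrative sketch, not the ISA algorithm).

    `a` is an m x n matrix given as a list of rows. Returns (us, sigma, v):
      us    - columns of A*V, i.e. left singular vectors scaled by sigma
      sigma - singular values (column norms of A*V), unsorted
      v     - columns of the accumulated rotation matrix V
    """
    m, n = len(a), len(a[0])
    # Work column by column: cols[j] holds column j of the evolving matrix A*V.
    cols = [[float(a[i][j]) for i in range(m)] for j in range(n)]
    v = [[1.0 if i == j else 0.0 for i in range(n)] for j in range(n)]

    for _ in range(max_sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                alpha = sum(x * x for x in cols[p])
                beta = sum(x * x for x in cols[q])
                gamma = sum(x * y for x, y in zip(cols[p], cols[q]))
                # Skip pairs that are already orthogonal to within tol.
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue
                rotated = True
                # Rotation angle that makes columns p and q orthogonal.
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                # Apply the same plane rotation to A*V and to V.
                for vecs in (cols, v):
                    vp, vq = vecs[p], vecs[q]
                    for i in range(len(vp)):
                        vp[i], vq[i] = c * vp[i] - s * vq[i], s * vp[i] + c * vq[i]
        if not rotated:
            break  # a full sweep with no rotation: all column pairs are orthogonal

    sigma = [math.sqrt(sum(x * x for x in col)) for col in cols]
    return cols, sigma, v


if __name__ == "__main__":
    # Hypothetical toy matrix; real term-document matrices are far larger.
    us, sigma, v = one_sided_jacobi_svd([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
    print(sigma)
```

Each sweep visits every column pair and rotates it until the pair is orthogonal, so the cost grows with the square of the number of columns per sweep. That per-pair orthogonalization is the step the abstract says ISA simplifies in order to gain speed.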
Hu Guan, Bin Xiao, Jingyu Zhou, Minyi Guo, Tao Yang
Added 10 Feb 2011
Updated 10 Feb 2011
Type Journal
Year 2010
Where CIKM
Authors Hu Guan, Bin Xiao, Jingyu Zhou, Minyi Guo, Tao Yang