Sciweavers

606 search results - page 57 / 122
» A novel feature selection algorithm for text categorization
WWW
2008
ACM
14 years 9 months ago
Floatcascade learning for fast imbalanced web mining
This paper is concerned with the problem of Imbalanced Classification (IC) in web mining, which often arises on the web due to the "Matthew Effect". As web IC applicatio...
Xiaoxun Zhang, Xueying Wang, Honglei Guo, Zhili Gu...
SIGIR
2006
ACM
14 years 2 months ago
Latent semantic analysis for multiple-type interrelated data objects
Co-occurrence data is quite common in many real applications. Latent Semantic Analysis (LSA) has been successfully used to identify semantic relations in such data. However, LSA c...
Xuanhui Wang, Jian-Tao Sun, Zheng Chen, ChengXiang...
JMLR
2006
13 years 8 months ago
Spam Filtering Using Statistical Data Compression Models
Spam filtering poses a special problem in text categorization, of which the defining characteristic is that filters face an active adversary, which constantly attempts to evade fi...
Andrej Bratko, Gordon V. Cormack, Bogdan Filipic, ...
KDD
2005
ACM
14 years 9 months ago
On the use of linear programming for unsupervised text classification
We propose a new algorithm for dimensionality reduction and unsupervised text classification. We use mixture models as underlying process of generating corpus and utilize a novel,...
Mark Sandler
CIKM
2006
Springer
14 years 17 days ago
A comparative study on classifying the functions of web page blocks
In this paper, we study the problem of learning block classification models to estimate block functions. We distinguish general models, which are learned across multiple sites, an...
Xiangye Xiao, Qiong Luo, Xing Xie, Wei-Ying Ma