Sciweavers

81 search results - page 3 / 17
» Learning to Separate Text Content and Style for Classificati...
AIRWEB
2006
Springer
13 years 11 months ago
Tracking Web Spam with Hidden Style Similarity
Automatically generated content is ubiquitous in the web: dynamic sites built using the three-tier paradigm are good examples (e.g. commercial sites, blogs and other sites powered...
Tanguy Urvoy, Thomas Lavergne, Pascal Filoche
SIGIR
2000
ACM
13 years 12 months ago
Hierarchical classification of Web content
This paper explores the use of hierarchical structure for classifying a large, heterogeneous collection of web content. The hierarchical structure is initially used to train diffe...
Susan T. Dumais, Hao Chen
CIKM
2006
ACM
13 years 11 months ago
Performance thresholding in practical text classification
In practical classification, there is often a mix of learnable and unlearnable classes and only a classifier above a minimum performance threshold can be deployed. This problem is...
Hinrich Schütze, Emre Velipasaoglu, Jan O. Pe...
KDD
2002
ACM
14 years 8 months ago
A parallel learning algorithm for text classification
Text classification is the process of classifying documents into predefined categories based on their content. Existing supervised learning algorithms to automatically classify te...
Canasai Kruengkrai, Chuleerat Jaruskulchai
SMC
2010
IEEE
13 years 6 months ago
Semantic enrichment of text representation with Wikipedia for text classification
Text classification is a widely studied topic in the area of machine learning. A number of techniques have been developed to represent and classify text documents. Most of the t...
Hiroki Yamakawa, Jing Peng, Anna Feldman