Sciweavers

WWW
2004
ACM
14 years 8 months ago
Using URLs and table layout for web classification tasks
We propose new features and algorithms for automating Web-page classification tasks such as content recommendation and ad blocking. We show that the automated classification of We...
L. K. Shih, David R. Karger
CIKM
2004
Springer
14 years 1 month ago
Hierarchical document categorization with support vector machines
Automatically categorizing documents into pre-defined topic hierarchies or taxonomies is a crucial step in knowledge and content management. Standard machine learning techniques ...
Lijuan Cai, Thomas Hofmann
CIKM
2006
Springer
13 years 11 months ago
Knowing a web page by the company it keeps
Web page classification is important to many tasks in information retrieval and web mining. However, applying traditional textual classifiers on web data often produces unsatisfyi...
Xiaoguang Qi, Brian D. Davison
AIRWEB
2009
Springer
14 years 2 months ago
Looking into the past to better classify web spam
Web spamming techniques aim to achieve undeserved rankings in search results. Research has been widely conducted on identifying such spam and neutralizing its influence. However,...
Na Dai, Brian D. Davison, Xiaoguang Qi
DILS
2009
Springer
14 years 2 months ago
Site-Wide Wrapper Induction for Life Science Deep Web Databases
We present a novel approach to automatic information extraction from Deep Web Life Science databases using wrapper induction. Traditional wrapper induction techniques focus on lear...
Saqib Mir, Steffen Staab, Isabel Rojas