Sciweavers

» Robust web content extraction
ICIP
2009
IEEE
14 years 8 months ago
Physics-based Illuminant Color Estimation As An Image Semantics Clue
Most algorithms for extracting illuminant chromaticity from arbitrary images, such as the images found on the web, are based on machine learning techniques. We will show how a phy...
CIKM
2006
Springer
13 years 11 months ago
Mining blog stories using community-based and temporal clustering
In recent years, weblogs, or blogs for short, have become an important form of online content. The personal nature of blogs, online interactions between bloggers, and the temporal...
Arun Qamra, Belle L. Tseng, Edward Y. Chang
KDD
2004
ACM
14 years 7 months ago
Boosting for Text Classification with Semantic Features
Current text classification systems typically use term stems for representing document content. Semantic Web technologies allow the usage of features on a higher semantic...
Stephan Bloehdorn, Andreas Hotho
WWW
2009
ACM
14 years 8 months ago
Link based small sample learning for web spam detection
Robust web spam detection systems based on statistical learning often require large amounts of labeled training data. However, labeled samples are more difficult, expensive and time ...
Guanggang Geng, Qiudan Li, Xinchang Zhang
WWW
2007
ACM
14 years 8 months ago
Integrating web directories by learning their structures
Documents in the Web are often organized using category trees by information providers (e.g. CNN, BBC) or search engines (e.g. Google, Yahoo!). Such category trees are commonly kn...
Christopher C. Yang, Jianfeng Lin