Sciweavers

116 search results - page 13 / 24
» A machine learning approach to web page filtering using cont...
WIDM
2006
ACM
14 years 1 month ago
Coarse-grained classification of web sites by their structural properties
In this paper, we identify and analyze structural properties which reflect the functionality of a Web site. These structural properties consider the size, the organization, the co...
Christoph Lindemann, Lars Littig
KDD
2010
ACM
13 years 11 months ago
Growing a tree in the forest: constructing folksonomies by integrating structured metadata
Many social Web sites allow users to annotate the content with descriptive metadata, such as tags, and more recently to organize content hierarchically. These types of structured ...
Anon Plangprasopchok, Kristina Lerman, Lise Getoor
CIKM
2007
Springer
14 years 1 month ago
Structure and semantics for expressive text kernels
Several problems in text categorization are too hard to be solved by standard bag-of-words representations. Work in kernel-based learning has approached this problem by (i) consid...
Stephan Bloehdorn, Alessandro Moschitti
ECCV
2008
Springer
14 years 9 months ago
Learning Visual Shape Lexicon for Document Image Content Recognition
Developing effective content recognition methods for diverse imagery continues to challenge computer vision researchers. We present a new approach for document image content catego...
Guangyu Zhu, Xiaodong Yu, Yi Li, David S. Doermann
WWW
2008
ACM
14 years 8 months ago
As we may perceive: finding the boundaries of compound documents on the web
This paper considers the problem of identifying on the Web compound documents (cDocs): groups of web pages that in aggregate constitute semantically coherent information entities...
Pavel Dmitriev