Sciweavers

55 search results - page 6 / 11
» An Analysis on Topic Features and Difficulties Based on Web ...
CIKM
2009
Springer
14 years 1 month ago
Vetting the links of the web
Many web links mislead human surfers and automated crawlers because they point to changed content, out-of-date information, or invalid URLs. It is a particular problem for large, ...
Na Dai, Brian D. Davison
HT
2005
ACM
14 years 8 days ago
As we may perceive: inferring logical documents from hypertext
In recent years, many algorithms for the Web have been developed that work with information units distinct from individual web pages. These include segments of web pages or aggreg...
Pavel Dmitriev, Carl Lagoze, Boris Suchkov
WWW
2008
ACM
14 years 7 months ago
Detecting image spam using visual features and near duplicate detection
Email spam is a much studied topic, but even though current email spam detecting software has been gaining a competitive edge against text based email spam, new advances in spam g...
Bhaskar Mehta, Saurabh Nangia, Manish Gupta 0002, ...
KDD
2001
ACM
14 years 7 months ago
A Framework for Efficient and Anonymous Web Usage Mining Based on Client-Side Tracking
Web Usage Mining (WUM), a natural application of data mining techniques to the data collected from user interactions with the web, has greatly concerned both academia and industry ...
Cyrus Shahabi, Farnoush Banaei Kashani
KDD
2002
ACM
14 years 7 months ago
Discovering informative content blocks from Web documents
In this paper, we propose a new approach to discover informative contents from a set of tabular documents (or Web pages) of a Web site. Our system, InfoDiscoverer, first partition...
Shian-Hua Lin, Jan-Ming Ho