Sciweavers

295 search results - page 48 / 59
» Web Crawling
WWW
2001
ACM
14 years 10 months ago
Effective Web data extraction with standard XML technologies
We discuss the problem of Web data extraction and describe an XML-based methodology whose goal extends far beyond simple "screen scraping." An ideal data extraction proc...
Jussi Myllymaki
WWW
2011
ACM
13 years 4 months ago
Prophiler: a fast filter for the large-scale detection of malicious web pages
Malicious web pages that host drive-by-download exploits have become a popular means for compromising hosts on the Internet and, subsequently, for creating large-scale botnets. In...
Davide Canali, Marco Cova, Giovanni Vigna, Christo...
ECIR
2006
Springer
13 years 11 months ago
Automatic Document Organization in a P2P Environment
This paper describes an efficient method to construct reliable machine learning applications in peer-to-peer (P2P) networks by building ensemble-based meta methods. We co...
Stefan Siersdorfer, Sergej Sizov
SIGIR
2006
ACM
14 years 3 months ago
AggregateRank: bringing order to web sites
Since the website is one of the most important organizational structures of the Web, how to effectively rank websites has been essential to many Web applications, such as Web sear...
Guang Feng, Tie-Yan Liu, Ying Wang, Ying Bao, Zhim...
ICIP
2000
IEEE
14 years 11 months ago
Efficient Video Similarity Measurement and Search
We consider the use of meta-data and/or video-domain methods to detect similar videos on the web. Meta-data is extracted from the textual and hyperlink information associated with...
Sen-Ching S. Cheung, Avideh Zakhor