Sciweavers

295 search results - page 25 / 59
» Web Crawling
ICCS
2007
Springer
14 years 1 month ago
Estimating the Change of Web Pages
This paper presents estimation methods that compute the probabilities of how many times web pages will be downloaded and modified, respectively, in future crawls. The methods can ...
Sung Jin Kim, Sang Ho Lee
MAICS
2004
13 years 11 months ago
Creation of a Style Independent Intelligent Autonomous Citation Indexer to Support Academic Research
This paper describes the current state of RUgle, a system for classifying and indexing papers made available on the World Wide Web, in a domain-independent and universal manner. B...
Eric G. Berkowitz, Mohamed Reda Elkhadiri
WWW
2007
ACM
14 years 10 months ago
A large-scale study of robots.txt
Search engines largely rely on Web robots to collect information from the Web. Due to the unregulated open-access nature of the Web, robot activities are extremely diverse. Such c...
Yang Sun, Ziming Zhuang, C. Lee Giles
ICDE
2002
IEEE
14 years 11 months ago
Design and Implementation of a High-Performance Distributed Web Crawler
Broad web search engines as well as many more specialized search tools rely on web crawlers to acquire large collections of pages for indexing and analysis. Such a web crawler may...
Vladislav Shkapenyuk, Torsten Suel
IPM
2008
13 years 9 months ago
DistanceRank: An intelligent ranking algorithm for web pages
A fast and efficient page-ranking mechanism for web crawling and retrieval remains a challenging issue. Recently, several link-based ranking algorithms like PageRank, HITS and ...
Ali Mohammad Zareh Bidoki, Nasser Yazdani