Sciweavers

611 search results - page 11 / 123
» Random web crawls
WAW
2004
Springer
14 years 1 month ago
Do Your Worst to Make the Best: Paradoxical Effects in PageRank Incremental Computations (Extended Abstract)
Deciding which kind of visit accumulates high-quality pages more quickly is one of the most often debated issues i...
Paolo Boldi, Massimo Santini, Sebastiano Vigna
SIGIR
2008
ACM
13 years 7 months ago
Compressed collections for simulated crawling
Collections are a fundamental tool for reproducible evaluation of information retrieval techniques. We describe a new method for distributing the document lengths and term counts ...
Alessio Orlandi, Sebastiano Vigna
WWW
2006
ACM
14 years 8 months ago
What's really new on the web?: identifying new pages from a series of unstable web snapshots
Identifying and tracking new information on the Web is important in sociology, marketing, and survey research, since new trends might be apparent in the new information. Such chan...
Masashi Toyoda, Masaru Kitsuregawa
WWW
2008
ACM
14 years 8 months ago
Low-load server crawler: design and evaluation
This paper proposes a method of crawling Web servers connected to the Internet without imposing a high processing load. We are using the crawler for a field survey of the digital ...
Katsuko T. Nakahira, Tetsuya Hoshino, Yoshiki Mika...