Sciweavers

295 search results - page 29 / 59
» Web Crawling
Sort
View
EDBTW
2010
Springer
13 years 8 months ago
Using visual pages analysis for optimizing web archiving
Due to the growing importance of the World Wide Web, archiving it has become crucial for preserving useful source of information. To maintain a web archive up-to-date, crawlers ha...
Myriam Ben Saad, Stéphane Gançarski
WWW
2008
ACM
14 years 10 months ago
Geographic web usage estimation by monitoring DNS caches
DNS is one of the most actively used distributed databases on earth, accessed by millions of people every day to transparently convert host names into IP addresses and vice versa....
Hüseyin Akcan, Torsten Suel, Hervé Br&...
WWW
2011
ACM
13 years 4 months ago
we.b: the web of short urls
Short URLs have become ubiquitous. Especially popular within social networking services, short URLs have seen a significant increase in their usage over the past years, mostly du...
Demetres Antoniades, Iasonas Polakis, Georgios Kon...
EDBT
2006
ACM
137views Database» more  EDBT 2006»
14 years 10 months ago
IQN Routing: Integrating Quality and Novelty in P2P Querying and Ranking
Abstract. We consider a collaboration of peers autonomously crawling the Web. A pivotal issue when designing a peer-to-peer (P2P) Web search engine in this environment is query rou...
Sebastian Michel, Matthias Bender, Peter Triantafi...
CLEF
2010
Springer
13 years 11 months ago
MapReduce for Information Retrieval Evaluation: "Let's Quickly Test This on 12 TB of Data"
We propose to use MapReduce to quickly test new retrieval approaches on a cluster of machines by sequentially scanning all documents. We present a small case study in which we use ...
Djoerd Hiemstra, Claudia Hauff