Search Sciweavers | Sciweavers

WIDM
2006
ACM

Lazy preservation: reconstructing websites by crawling the crawlers

16 years 19 days ago

Backup of websites is often not considered until after a catastrophic event has occurred to either the website or its webmaster. We introduce “lazy preservation” – digital p...

Frank McCown, Joan A. Smith, Michael L. Nelson

CORR
2012
Springer

Optimal Threshold Control by the Robots of Web Search Engines with Obsolescence of Documents

14 years 2 months ago

A typical web search engine consists of three principal parts: crawling engine, indexing engine, and searching engine. The present work aims to optimize the performance of the cra...

Konstantin Avrachenkov, Alexander N. Dudin, Valent...

WWW
2007
ACM

Parallel crawling for online social networks

16 years 7 months ago

Given a huge online social network, how do we retrieve information from it through crawling? Even better, how do we improve the crawling performance by using parallel crawlers tha...

Duen Horng Chau, Shashank Pandit, Samuel Wang, Chr...

SIGIR
2002
ACM

Do TREC web collections look like the web?

15 years 6 months ago

We measure the WT10g test collection, used in the TREC-9 and TREC 2001 Web Tracks, and the .GOV test collection used in the TREC 2002 Web and Interactive Tracks, with common measu...

Ian Soboroff

ERCIMDL
2003
Springer

Topical Crawling for Business Intelligence

15 years 12 months ago

The Web provides us with a vast resource for business intelligence. However, the large size of the Web and its dynamic nature make the task of foraging appropriate inform...

Gautam Pant, Filippo Menczer
