Search Sciweavers | Sciweavers


KDD
2002
ACM

115 views, Data Mining

Collaborative crawling: mining user experiences for topical resource discovery

16 years 7 months ago

The rapid growth of the world wide web has made the problem of topic specific resource discovery an important one in recent years. In this problem, it is desired to find web pages wh...

Charu C. Aggarwal



WWW
2003
ACM

133 views, Internet Technology

Efficient URL caching for world wide web crawling

16 years 7 months ago


Crawling the web is deceptively simple: the basic algorithm is (a) Fetch a page (b) Parse it to extract all linked URLs (c) For all the URLs not seen before, repeat (a)–(c). Howev...

Andrei Z. Broder, Marc Najork, Janet L. Wiener
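
The basic algorithm quoted in the abstract above, steps (a) to (c), can be sketched roughly as follows. This is a minimal illustration and not the paper's implementation: the names `crawl`, `LinkExtractor`, and `FAKE_WEB` are hypothetical, the "web" is a small in-memory dictionary so the example runs offline, and the "seen" test is a plain Python set.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, fetch):
    """Visit every page reachable from seed.

    fetch(url) is an assumed callable returning the page HTML, or None
    if the page cannot be retrieved. Returns URLs in visiting order.
    """
    seen = {seed}
    frontier = deque([seed])
    order = []
    while frontier:
        url = frontier.popleft()
        html = fetch(url)              # (a) fetch a page
        if html is None:
            continue
        order.append(url)
        parser = LinkExtractor()
        parser.feed(html)              # (b) parse it to extract linked URLs
        for href in parser.links:
            link = urljoin(url, href)  # resolve relative links
            if link not in seen:       # (c) repeat only for unseen URLs
                seen.add(link)
                frontier.append(link)
    return order


# Hypothetical three-page site held in memory (an assumption for the demo).
FAKE_WEB = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.test/a": '<a href="/">home</a> <a href="/b">B</a>',
    "http://example.test/b": '<a href="/a">A</a>',
}

visited = crawl("http://example.test/", FAKE_WEB.get)
```

The `seen` membership test is the step the paper's title refers to: at web scale that set no longer fits comfortably in memory, which is where a URL cache would come in. The in-memory set here only stands in for that component.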



IR
2008

189 views, Natural Language Processing

Focused web crawling in the acquisition of comparable corpora

15 years 6 months ago


CLIR resources, such as dictionaries and parallel corpora, are scarce for special domains. Obtaining comparable corpora automatically for such domains could be an answer to this p...

Tuomas Talvensaari, Ari Pirkola, Kalervo Järv...



CIKM
2010
Springer

166 views, Information Technology

Crawling the web for structured documents

15 years 3 months ago


Structured Information Retrieval has been gaining a lot of interest in recent years, as this kind of information is becoming an invaluable asset for professional communities such as Sof...

Julián Urbano, Juan Loréns, Yorgos A...



HICSS
1999
IEEE

178 views, Biometrics

Collaborative Web Crawling: Information Gathering/Processing over Internet

15 years 11 months ago


The main objective of the IBM Grand Central Station (GCS) is to gather information in virtually any format (text, data, image, graphics, audio, video) from the cyberspace...

Shang-Hua Teng, Qi Lu, Matthias Eichstaedt, Daniel...
