Search Sciweavers | Sciweavers

WWW
2006
ACM

What's really new on the web?: identifying new pages from a series of unstable web snapshots

Identifying and tracking new information on the Web is important in sociology, marketing, and survey research, since new trends might be apparent in the new information. Such chan...

Masashi Toyoda, Masaru Kitsuregawa

WWW
2003
ACM

Efficient URL caching for world wide web crawling

Crawling the web is deceptively simple: the basic algorithm is (a) Fetch a page (b) Parse it to extract all linked URLs (c) For all the URLs not seen before, repeat (a)–(c). Howev...

Andrei Z. Broder, Marc Najork, Janet L. Wiener
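
The three steps (a)–(c) quoted in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the paper's method: the "seen" test here is a plain in-memory set rather than any URL caching scheme, and `crawl`, `LinkExtractor`, `PAGES`, and the `example.test` URLs are hypothetical names chosen so the sketch runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, fetch, max_pages=100):
    """Breadth-first crawl from seed; fetch(url) returns HTML or None."""
    seen = {seed}                 # every URL ever queued
    frontier = deque([seed])      # URLs waiting to be fetched
    visited = []                  # URLs fetched successfully, in order
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        html = fetch(url)         # (a) fetch a page
        if html is None:
            continue
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)         # (b) parse it to extract linked URLs
        for href in parser.links:
            link = urljoin(url, href)
            if link not in seen:  # (c) queue only URLs not seen before
                seen.add(link)
                frontier.append(link)
    return visited


# Hypothetical in-memory "web" standing in for real HTTP fetches.
PAGES = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.test/a": '<a href="/b">B</a> <a href="/">home</a>',
    "http://example.test/b": '<a href="/c">C</a>',
}

order = crawl("http://example.test/", PAGES.get)
```

Passing `fetch` as a parameter keeps the loop independent of any HTTP library. The `seen` set grows with every discovered URL, which is the membership test that becomes costly at web scale.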

HICSS
1999
IEEE

Collaborative Web Crawling: Information Gathering/Processing over Internet

The main objective of the IBM Grand Central Station (GCS) is to gather information of virtually any type of formats (text, data, image, graphics, audio, video) from the cyberspace...

Shang-Hua Teng, Qi Lu, Matthias Eichstaedt, Daniel...

SIGIR
2008
ACM

Exploring traversal strategy for web forum crawling

In this paper, we study the problem of Web forum crawling. Web forum has now become an important data source of many Web applications; while forum crawling is still a challenging ...

Yida Wang, Jiang-Ming Yang, Wei Lai, Rui Cai, Lei ...

CN
2000

Graph structure in the Web

The study of the web as a graph is not only fascinating in its own right, but also yields valuable insight into web algorithms for crawling, searching and community discovery, and...

Andrei Z. Broder, Ravi Kumar, Farzin Maghoul, Prab...
