Search Sciweavers | Sciweavers


WSDM
2009
ACM

176 views, Data Mining

The web changes everything: understanding the dynamics of web content

16 years 1 month ago

The Web is a dynamic, ever-changing collection of information. This paper explores changes in Web content by analyzing a crawl of 55,000 Web pages, selected to represent different...

Eytan Adar, Jaime Teevan, Susan T. Dumais, Jonatha...


ICS
2010
Tsinghua U.

268 views, Distributed And Parallel Com...

Local Algorithms for Finding Interesting Individuals in Large Networks

16 years 4 months ago


We initiate the study of local, sublinear time algorithms for finding vertices with extreme topological properties -- such as high degree or clustering coefficient -- in large so...

Mickey Brautbar, Michael Kearns


WWW
2006
ACM

237 views, Internet Technology

Effective web-scale crawling through website analysis

16 years 7 months ago


The web crawler space is often delimited into two general areas: full-web crawling and focused crawling. We present netSifter, a crawler system which integrates features from thes...

Iván González, Adam Marcus, Daniel N. M...


ADBIS
2004
Springer

113 views, Database

Ipmicra: Toward a Distributed and Adaptable Location Aware Web Crawler

16 years 1 day ago


Distributed crawling has shown that it can overcome important limitations of the centralized crawling paradigm. However, the distributed nature of current distributed cra...

Odysseas Papapetrou, George Samaras


IADIS
2004

130 views, Internet Technology

Crawling the client-side hidden web

15 years 8 months ago


There is a great amount of information on the web that cannot be accessed by conventional crawler engines. This portion of the web is usually called hidden web data. To be able t...

Manuel Álvarez, Alberto Pan, Juan Raposo, ...
