Sciweavers

611 search results - page 5 / 123
» Random web crawls
WSDM
2009
ACM
176 views, Data Mining
14 years 2 months ago
The web changes everything: understanding the dynamics of web content
The Web is a dynamic, ever changing collection of information. This paper explores changes in Web content by analyzing a crawl of 55,000 Web pages, selected to represent different...
Eytan Adar, Jaime Teevan, Susan T. Dumais, Jonatha...
ICS
2010
Tsinghua U.
14 years 5 months ago
Local Algorithms for Finding Interesting Individuals in Large Networks
We initiate the study of local, sublinear-time algorithms for finding vertices with extreme topological properties -- such as high degree or clustering coefficient -- in large so...
Mickey Brautbar, Michael Kearns
WWW
2006
ACM
14 years 8 months ago
Effective web-scale crawling through website analysis
The web crawler space is often delimited into two general areas: full-web crawling and focused crawling. We present netSifter, a crawler system which integrates features from thes...
Iván González, Adam Marcus 0002, Daniel N. M...
ADBIS
2004
Springer
113 views, Database
14 years 1 months ago
Ipmicra: Toward a Distributed and Adaptable Location Aware Web Crawler
Distributed crawling has shown that it can overcome important limitations of the centralized crawling paradigm. However, the distributed nature of current distributed cra...
Odysseas Papapetrou, George Samaras
IADIS
2004
13 years 9 months ago
Crawling the client-side hidden web
There is a great amount of information on the web that cannot be accessed by conventional crawler engines. This portion of the web is usually called hidden web data. To be able t...
Manuel Álvarez, Alberto Pan, Juan Raposo, ...