Sciweavers

38 search results - page 5 / 8
» The indexable web is more than 11.5 billion pages
CN
1998
13 years 10 months ago
The Anatomy of a Large-Scale Hypertextual Web Search Engine
In this paper, we present Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext. Google is designed to crawl and index the...
Sergey Brin, Lawrence Page
WWW
2005
ACM
14 years 11 months ago
Scaling link-based similarity search
To exploit the similarity information hidden in the hyperlink structure of the web, this paper introduces algorithms scalable to graphs with billions of vertices on a distributed ...
Balázs Rácz, Dániel Fogaras
JCDL
2010
ACM
14 years 3 months ago
Exposing the hidden web for chemical digital libraries
In recent years, the vast amount of digitally available content has led to the creation of many topic-centered digital libraries. Also in the domain of chemistry more and more di...
Sascha Tönnies, Benjamin Köhncke, Oliver...
PVLDB
2008
13 years 9 months ago
Google's Deep Web crawl
The Deep Web, i.e., content hidden behind HTML forms, has long been acknowledged as a significant gap in search engine coverage. Since it represents a large portion of the structu...
Jayant Madhavan, David Ko, Lucja Kot, Vignesh Gana...
AIRWEB
2006
Springer
14 years 2 months ago
Improving Cloaking Detection using Search Query Popularity and Monetizability
Cloaking is a search engine spamming technique used by some Web sites to deliver one page to a search engine for indexing while serving an entirely different page to users browsin...
Kumar Chellapilla, David Maxwell Chickering