Sciweavers

Crawling the web for structured documents
CORR
2012
Springer
Optimal Threshold Control by the Robots of Web Search Engines with Obsolescence of Documents
A typical web search engine consists of three principal parts: crawling engine, indexing engine, and searching engine. The present work aims to optimize the performance of the cra...
Konstantin Avrachenkov, Alexander N. Dudin, Valent...
CN
1999
Focused Crawling: A New Approach to Topic-Specific Web Resource Discovery
The rapid growth of the World-Wide Web poses unprecedented scaling challenges for general-purpose crawlers and search engines. In this paper we describe a new hypertext resource d...
Soumen Chakrabarti, Martin van den Berg, Byron Dom
CN
2000
Graph structure in the Web
The study of the web as a graph is not only fascinating in its own right, but also yields valuable insight into web algorithms for crawling, searching and community discovery, and...
Andrei Z. Broder, Ravi Kumar, Farzin Maghoul, Prab...
WWW
2009
ACM
Sitemaps: above and beyond the crawl of duty
Comprehensive coverage of the public web is crucial to web search engines. Search engines use crawlers to retrieve pages and then discover new ones by extracting the pages' o...
Uri Schonfeld, Narayanan Shivakumar
WEBDB
2005
Springer
Searching for Hidden-Web Databases
Recently, there has been increased interest in the retrieval and integration of hidden Web data with a view to leverage high-quality information available in online databases. Alt...
Luciano Barbosa, Juliana Freire