Sciweavers

326 search results - page 9 / 66
» Optimal crawling strategies for web search engines
WWW
2007
ACM
14 years 8 months ago
Crawling multiple UDDI business registries
As Web services proliferate, size and magnitude of UDDI Business Registries (UBRs) are likely to increase. The ability to discover Web services of interest then across multiple UB...
Eyhab Al-Masri, Qusay H. Mahmoud
WWW
2006
ACM
14 years 1 months ago
Do not crawl in the DUST: different URLs with similar text
We consider the problem of DUST: Different URLs with Similar Text. Such duplicate URLs are prevalent in web sites, as web server software often uses aliases and redirections, and...
Uri Schonfeld, Ziv Bar-Yossef, Idit Keidar
WEBDB
2005
Springer
14 years 27 days ago
Design and Implementation of a Geographic Search Engine
In this paper, we describe the design and initial implementation of a geographic search engine prototype for Germany, based on a large crawl of the de domain. Geographic search en...
Alexander Markowetz, Yen-Yu Chen, Torsten Suel, Xi...
CN
1999
13 years 7 months ago
Focused Crawling: A New Approach to Topic-Specific Web Resource Discovery
The rapid growth of the World-Wide Web poses unprecedented scaling challenges for general-purpose crawlers and search engines. In this paper we describe a new hypertext resource d...
Soumen Chakrabarti, Martin van den Berg, Byron Dom
WWW
2011
ACM
13 years 2 months ago
Inverted index compression via online document routing
Modern search engines are expected to make documents searchable shortly after they appear on the ever changing Web. To satisfy this requirement, the Web is frequently crawled. Due...
Gal Lavee, Ronny Lempel, Edo Liberty, Oren Somekh