Sciweavers

295 search results - page 4 / 59
» Web Crawling
ITSSA
2006
13 years 8 months ago
Agent-Based Approach for Web Crawling
Since its creation in 1990, the World Wide Web has increased the popularity of the Internet, which has become an important source of information or services for people all over the world. Th...
Maxime Wack, Mohamed Bakhouya, Jaafar Gaber
ADMA
2009
Springer
14 years 3 months ago
Crawling Deep Web Using a New Set Covering Algorithm
Abstract. Crawling the deep web often requires the selection of an appropriate set of queries so that they can cover most of the documents in the data source with low cost. This ca...
Yan Wang, Jianguo Lu, Jessica Chen
ADBIS
2004
Springer
14 years 1 month ago
Ipmicra: Toward a Distributed and Adaptable Location Aware Web Crawler
Abstract. Distributed crawling has shown that it can overcome important limitations of the centralized crawling paradigm. However, the distributed nature of current distributed cra...
Odysseas Papapetrou, George Samaras
IADIS
2004
13 years 10 months ago
Crawling the client-side hidden web
There is a great amount of information on the web that cannot be accessed by conventional crawler engines. This portion of the web is usually called hidden web data. To be able t...
Manuel Álvarez, Alberto Pan, Juan Raposo, &...
WWW
2007
ACM
14 years 9 months ago
Detecting near-duplicates for web crawling
Near-duplicate web documents are abundant. Two such documents differ from each other in a very small portion that displays advertisements, for example. Such differences are irrele...
Gurmeet Singh Manku, Arvind Jain, Anish Das Sarma