Sciweavers

178 search results - page 9 / 36
» Scheduling Algorithms for Web Crawling
Sort
View
WWW
2007
ACM
14 years 8 months ago
Efficient Update of Indexes for Dynamically Changing Web Documents
Recent work on incremental crawling has enabled the indexed document collection of a search engine to be more synchronized with the changing World Wide Web. However, this synchron...
Lipyeow Lim, Min Wang, Sriram Padmanabhan, Jeffrey...
WWW
2005
ACM
14 years 8 months ago
Exploiting the deep web with DynaBot: matching, probing, and ranking
We present the design of Dynabot, a guided Deep Web discovery system. Dynabot's modular architecture supports focused crawling of the Deep Web with an emphasis on matching, p...
Daniel Rocco, James Caverlee, Ling Liu, Terence Cr...
WISE
2007
Springer
14 years 1 months ago
Querying Capability Modeling and Construction of Deep Web Sources
Information in a deep Web source can be accessed through queries submitted on its query interface. Many Web applications need to interact with the query interfaces of deep Web sour...
Liangcai Shu, Weiyi Meng, Hai He, Clement T. Yu
IADIS
2003
13 years 9 months ago
SPLAT: A System for Self-Plagiarism Detection
This paper presents a system for self-plagiarism detection, SPLAT. The system uses a WebL web spider that crawls through the web sites of the top fifty Computer Science department...
Christian S. Collberg, Stephen G. Kobourov, Joshua...
WEBDB
2005
Springer
129views Database» more  WEBDB 2005»
14 years 1 months ago
Searching for Hidden-Web Databases
Recently, there has been increased interest in the retrieval and integration of hidden Web data with a view to leverage high-quality information available in online databases. Alt...
Luciano Barbosa, Juliana Freire