Sciweavers

295 search results - page 20 / 59
» Web Crawling
WWW
2002
ACM
14 years 10 months ago
Parallel crawlers
In this paper we study how we can design an effective parallel crawler. As the size of the Web grows, it becomes imperative to parallelize a crawling process, in order to finish d...
Junghoo Cho, Hector Garcia-Molina
ADCS
2004
13 years 11 months ago
Focused Crawling in Depression Portal Search: A Feasibility Study
Previous work on domain-specific search services in the area of depressive illness has documented the significant human cost required to set up and maintain closed-crawl parameters....
Thanh Tin Tang, David Hawking, Nick Craswell, Rame...
SIGMOD
2006
ACM
14 years 10 months ago
To search or to crawl?: towards a query optimizer for text-centric tasks
Text is ubiquitous and, not surprisingly, many important applications rely on textual data for a variety of tasks. As a notable example, information extraction applications derive...
Panagiotis G. Ipeirotis, Eugene Agichtein, Pranay ...
ICMLA
2008
13 years 11 months ago
A Fully Automatic Crossword Generator
This paper presents a software system that can generate crosswords with no human intervention, including definition generation and crossword compilation. In particular, the ...
Leonardo Rigutini, Michelangelo Diligenti, Marco M...
WWW
2008
ACM
14 years 10 months ago
IRLbot: scaling to 6 billion pages and beyond
This paper shares our experience in designing a web crawler that can download billions of pages using a single-server implementation and models its performance. We show that with ...
Hsin-Tsang Lee, Derek Leonard, Xiaoming Wang, Dmit...