Sciweavers

1133 search results - page 4 / 227
» Distributed community crawling
HT
2003
ACM
14 years 18 days ago
Extracting evolution of web communities from a series of web archives
Recent advances in storage technology make it possible to store a series of large Web archives. It is now an exciting challenge for us to observe evolution of the Web. In this pap...
Masashi Toyoda, Masaru Kitsuregawa
PDP
2008
IEEE
14 years 1 month ago
Bulk-Synchronous On-Line Crawling on Clusters of Computers
This paper describes the design of a crawler devised to perform the periodic retrieval of Web documents for a search engine able to accept on-line updates in a concurrent manner. ...
Mauricio Marín, Carolina Bonacic
ITSSA
2006
13 years 7 months ago
Agent-Based Approach for Web Crawling
Since its creation in 1990, World Wide Web has increased the popularity of Internet which becomes an important source of information or services for all people over the world. Th...
Maxime Wack, Mohamed Bakhouya, Jaafar Gaber
SAC
2003
ACM
14 years 18 days ago
Ontology-Focused Crawling of Web Documents
The Web, the largest unstructured database of the world, has greatly improved access to documents. However, documents on the Web are largely disorganized. Due to the distributed n...
Marc Ehrig, Alexander Maedche
SIGIR
2008
ACM
13 years 7 months ago
Compressed collections for simulated crawling
Collections are a fundamental tool for reproducible evaluation of information retrieval techniques. We describe a new method for distributing the document lengths and term counts ...
Alessio Orlandi, Sebastiano Vigna