Sciweavers

101 search results - page 9 / 21
First-order focused crawling
SIGIR
2008
ACM
13 years 6 months ago
Compressed collections for simulated crawling
Collections are a fundamental tool for reproducible evaluation of information retrieval techniques. We describe a new method for distributing the document lengths and term counts ...
Alessio Orlandi, Sebastiano Vigna
ICDE
2006
IEEE
14 years 8 months ago
Query Selection Techniques for Efficient Crawling of Structured Web Sources
The high quality, structured data from Web structured sources is invaluable for many applications. Hidden Web databases are not directly crawlable by Web search engines and are on...
Ping Wu, Ji-Rong Wen, Huan Liu, Wei-Ying Ma
IAT
2009
IEEE
14 years 1 month ago
Intelligent Crawling in Virtual Worlds
We present an intelligent agent crawler designed to collect user-generated content in Second Life and related virtual worlds. The agents navigate autonomously through the world ...
Josh Eno, Susan Gauch, Craig W. Thompson
SIGMOD
2006
ACM
14 years 6 months ago
To search or to crawl?: towards a query optimizer for text-centric tasks
Text is ubiquitous and, not surprisingly, many important applications rely on textual data for a variety of tasks. As a notable example, information extraction applications derive...
Panagiotis G. Ipeirotis, Eugene Agichtein, Pranay ...
WWW
2005
ACM
14 years 7 months ago
User-centric Web crawling
Search engines are the primary gateways of information access on the Web today. Behind the scenes, search engines crawl the Web to populate a local indexed repository of Web pages...
Sandeep Pandey, Christopher Olston