Sciweavers

563 search results - page 5 / 113
» Crawling the web for structured documents
Sort
View
SIGIR
2008
ACM
13 years 8 months ago
Compressed collections for simulated crawling
Collections are a fundamental tool for reproducible evaluation of information retrieval techniques. We describe a new method for distributing the document lengths and term counts ...
Alessio Orlandi, Sebastiano Vigna
ISM
2008
IEEE
127views Multimedia» more  ISM 2008»
14 years 2 months ago
LeeDeo: Web-Crawled Academic Video Search Engine
We present our vision and preliminary design toward web-crawled academic video search engine, named as LeeDeo, that can search, crawl, archive, index, and browse “academic” vi...
Dongwon Lee, Hung-sik Kim, Eun Kyung Kim, Su Yan, ...
WWW
2010
ACM
14 years 3 months ago
Not so creepy crawler: easy crawler generation with standard xml queries
Web crawlers are increasingly used for focused tasks such as the extraction of data from Wikipedia or the analysis of social networks like last.fm. In these cases, pages are far m...
Franziska von dem Bussche, Klara A. Weiand, Benedi...
ERCIMDL
2005
Springer
305views Education» more  ERCIMDL 2005»
14 years 2 months ago
Focused Crawling Using Latent Semantic Indexing - An Application for Vertical Search Engines
Vertical search engines and web portals are gaining ground over the general-purpose engines due to their limited size and their high precision for the domain they cover. The number...
George Almpanidis, Constantine Kotropoulos, Ioanni...
WEBI
2007
Springer
14 years 2 months ago
Question Answering over Implicitly Structured Web Content
Implicitly structured content on the Web such as HTML tables and lists can be extremely valuable for web search, question answering, and information retrieval, as the implicit str...
Eugene Agichtein, Chris Burges, Eric Brill