Sciweavers

» Web Crawling
CORR
2010
Springer
13 years 10 months ago
MIREX: MapReduce Information Retrieval Experiments
We propose to use MapReduce to quickly test new retrieval approaches on a cluster of machines by sequentially scanning all documents. We present a small case study in which we use...
Djoerd Hiemstra, Claudia Hauff
WWW
2010
ACM
14 years 1 months ago
Time is of the essence: improving recency ranking using Twitter data
Realtime web search refers to the retrieval of very fresh content which is in high demand. An effective portal web search engine must support a variety of search needs, including ...
Anlei Dong, Ruiqiang Zhang, Pranam Kolari, Jing Ba...
DEXA
2010
Springer
13 years 8 months ago
Vi-DIFF: Understanding Web Pages Changes
Nowadays, many applications are interested in detecting and discovering changes on the web to help users to understand page updates and more generally, the web dynamics. Web archiv...
Zeynep Pehlivan, Myriam Ben Saad, Stéphane ...
WWW
2010
ACM
14 years 4 months ago
Not so creepy crawler: easy crawler generation with standard xml queries
Web crawlers are increasingly used for focused tasks such as the extraction of data from Wikipedia or the analysis of social networks like last.fm. In these cases, pages are far m...
Franziska von dem Bussche, Klara A. Weiand, Benedi...
SEMWEB
2007
Springer
14 years 4 months ago
Sindice.com: Weaving the Open Linked Data
Developers of Semantic Web applications face a challenge with respect to the decentralised publication model: where to find statements about encountered resources. The “linked d...
Giovanni Tummarello, Renaud Delbru, Eyal Oren