Sciweavers

72 search results - page 10 / 15
» Ontology-Focused Crawling of Web Documents
CORR
2010
Springer
13 years 8 months ago
MIREX: MapReduce Information Retrieval Experiments
We propose to use MapReduce to quickly test new retrieval approaches on a cluster of machines by sequentially scanning all documents. We present a small case study in which we use...
Djoerd Hiemstra, Claudia Hauff
EDBT
2006
ACM
14 years 8 months ago
IQN Routing: Integrating Quality and Novelty in P2P Querying and Ranking
Abstract. We consider a collaboration of peers autonomously crawling the Web. A pivotal issue when designing a peer-to-peer (P2P) Web search engine in this environment is query rou...
Sebastian Michel, Matthias Bender, Peter Triantafi...
WWW
2010
ACM
13 years 11 months ago
Time is of the essence: improving recency ranking using Twitter data
Realtime web search refers to the retrieval of very fresh content which is in high demand. An effective portal web search engine must support a variety of search needs, including ...
Anlei Dong, Ruiqiang Zhang, Pranam Kolari, Jing Ba...
WWW
2010
ACM
14 years 2 months ago
Not so creepy crawler: easy crawler generation with standard XML queries
Web crawlers are increasingly used for focused tasks such as the extraction of data from Wikipedia or the analysis of social networks like last.fm. In these cases, pages are far m...
Franziska von dem Bussche, Klara A. Weiand, Benedi...
IRI
2007
IEEE
14 years 2 months ago
Acronym-Expansion Recognition and Ranking on the Web
The paper presents a study on large-scale automatic extraction of acronyms and associated expansions from Web data and from the user interactions with this data through Web search...
Alpa Jain, Silviu Cucerzan, Saliha Azzam