The recent explosion of on-line information in Digital Libraries and on the World Wide Web has given rise to a number of query-based search engines and manually constructed topica...
Mehran Sahami, Salim Yusufali, Michelle Q. Wang Ba...
This paper offers a novel look at using a dimensionality-reduction technique called simhash [8] to detect similar document pairs in large-scale collections. We show that this algo...
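To make the idea behind simhash concrete, here is a minimal sketch of a generic simhash fingerprint and the Hamming-distance comparison typically used with it. It is not the method of the paper above: the whitespace tokenization, the unweighted tokens, the use of MD5 as the per-token hash, and the function names `simhash` and `hamming_distance` are all assumptions made for illustration.

```python
import hashlib


def simhash(text, bits=64):
    """Return a `bits`-wide simhash fingerprint of `text`.

    Assumptions (not from the source): tokens are lowercase
    whitespace-separated words, every token has weight 1, and the
    per-token hash is the first `bits` bits of its MD5 digest.
    """
    counts = [0] * bits
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[: bits // 8], "big")
        # Each token votes +1 or -1 on every bit position.
        for i in range(bits):
            counts[i] += 1 if (h >> i) & 1 else -1
    # A fingerprint bit is set when the positive votes win.
    fingerprint = 0
    for i, c in enumerate(counts):
        if c > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    a = simhash("the quick brown fox jumps over the lazy dog")
    b = simhash("the quick brown fox jumped over the lazy dog")
    print(hamming_distance(a, b))
```

Documents that share most of their tokens tend to produce fingerprints that differ in few bits, so a candidate pair can be flagged as similar when the Hamming distance falls below a chosen threshold. The threshold itself is application-specific and is not specified here.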
In this paper, we describe a methodology to estimate the geographic coverage of the web without the need for secondary knowledge or complex geo-tagging. This is achieved by random...
Robert Pasley, Paul Clough, Ross S. Purves, Floria...
The core task of sponsored search is to retrieve relevant ads for the user’s query. Ads can be retrieved either by exact match, when their bid term is identical to the query, or...
Michael Bendersky, Evgeniy Gabrilovich, Vanja Josi...
Medical visual information retrieval has been a very active research area over the past ten years as an increasing number of images are produced digitally and made available in the ...