Leveraging clickthrough data has become a popular approach for evaluating and optimizing information retrieval systems. Although data is plentiful, one must take care when interpr...
This paper is concerned with actively predicting search intent from user browsing behavior data. In recent years, great attention has been paid to predicting user search intent. H...
In this paper, we address the question of how we can identify hosts that will generate links to web spam. Detecting such spam link generators is important because almost all new s...
On a social news website, people share content they found on the web, called news, then vote for the items they like the most. Voting for a news item is then considered a recommendation,...
Thomas Largillier, Guillaume Peyronnet, Sylvain Pe...
We present a method for automatically acquiring a corpus of disputed claims from the web. We consider a factual claim to be disputed if a page on the web suggests both that the...
Rob Ennals, Dan Byler, John Mark Agosta, Barbara R...
Web search engines discover indexable documents by recursively ‘crawling’ from a seed URL. Their rankings take into account link popularity. While this works well, it introduc...
Tom Rowlands, David Hawking, Ramesh Sankaranarayan...
Advertising has become an integral and inseparable part of the World Wide Web. However, neither public auditing nor monitoring mechanisms yet exist in this emerging area. In thi...
Yong Wang, Daniel Burgener, Aleksandar Kuzmanovic,...
In this paper, we address the problem of database selection for XML document collections, that is, given a set of collections and a user query, how to rank the collections based o...