Large-scale web and text retrieval systems deal with amounts of data that greatly exceed the capacity of any single machine. To handle the necessary data volumes and query through...
We describe an HTML web page segmentation algorithm, which is applied to segment online medical journal articles (regular HTML and PDF-Converted-HTML files). The web page content ...
Motivated by the challenging task of designing “secure” vote storage mechanisms, we study information storage mechanisms that operate in extremely hostile environments. In suc...
Traditionally, collaborative filtering (CF) algorithms used for recommendation operate on complete knowledge. This makes these algorithms hard to employ in a decentralized contex...
In peer-to-peer networks, finding the appropriate answer for an information request, such as the answer to a query for RDF(S) data, depends on selecting the right peer in the netw...
Recommender systems have emerged in the past several years as an effective way to help people cope with the problem of information overload. One application in which they have bec...
We present a principled methodology for filtering news stories by formal measures of information novelty, and show how the techniques can be used to custom-tailor newsfeeds based ...
Evgeniy Gabrilovich, Susan T. Dumais, Eric Horvitz