We describe a machine learning approach for predicting sponsored search ad relevance. Our baseline model incorporates basic features of text overlap and we then extend the model t...
Dustin Hillard, Stefan Schroedl, Eren Manavoglu, H...
The backends of today’s Internet services rely heavily on caching at various layers both to provide faster service to common requests and to reduce load on back-end components. ...
Alexander Rasmussen, Emre Kiciman, V. Benjamin Liv...
We consider the problem of deep web source selection and argue that existing source selection methods are inadequate as they are based on local similarity assessment. Specificall...
We describe Dispute Finder, a browser extension that alerts a user when information they read online is disputed by a source that they might trust. Dispute Finder examines the tex...
In this paper, we address the question of how we can identify hosts that will generate links to web spam. Detecting such spam link generators is important because almost all new s...