Manually querying search engines in order to accumulate a large body of factual information is a tedious, error-prone process of piecemeal search. Search engines retrieve and rank...
Oren Etzioni, Michael J. Cafarella, Doug Downey, S...
Many museum and library archives are digitizing their large collections of handwritten historical manuscripts to enable public access to them. These collections are only available...
We investigate the subtle cues to user identity that may be exploited in attacks on the privacy of users in web search query logs. We study the application of simple classifiers ...
In recent years, active learning methods based on experimental design achieve state-of-the-art performance in text classification applications. Although these methods can exploit ...
SALSA is a link-based ranking algorithm that takes the result set of a query as input, extends the set to include additional neighboring documents in the web graph, and performs a...