This paper deals with content-based image retrieval. When the user is looking for large categories, statistical classification techniques are efficient as soon as the training se...
Matthieu Cord, Philippe Henri Gosselin, Sylvie Phi...
This paper shows how a text classifier's need for labeled training documents can be reduced by taking advantage of a large pool of unlabeled documents. We modify the Query-by...
Deduplication, a key operation in integrating data from multiple sources, is a time-consuming, labor-intensive and domainspecific operation. We present our design of alias that us...
We present an active learning framework to simultaneously learn appearance and contextual models for scene understanding tasks (multi-class classification). Existing multi-class a...
For many types of machine learning algorithms, one can compute the statistically optimal" way to select training data. In this paper, we review how optimal data selection tec...
David A. Cohn, Zoubin Ghahramani, Michael I. Jorda...