Deduplication, a key operation in integrating data from multiple sources, is a time-consuming, labor-intensive and domainspecific operation. We present our design of alias that us...
Scarcity and infeasibility of human supervision for large
scale multi-class classification problems necessitates active
learning. Unfortunately, existing active learning methods
...
Prateek Jain (University of Texas at Austin), Ashi...
Active learning [1] is a branch of Machine Learning in which the learning algorithm, instead of being directly provided with pairs of problem instances and their solutions (their l...
In information retrieval, relevance of documents with respect to queries is usually judged by humans, and used in evaluation and/or learning of ranking functions. Previous work ha...
Jingfang Xu, Chuanliang Chen, Gu Xu, Hang Li, Elbi...
The ranking function used by search engines to order results is learned from labeled training data. Each training point is a (query, URL) pair that is labeled by a human judge who...
Rakesh Agrawal, Alan Halverson, Krishnaram Kenthap...