Scarcity and infeasibility of human supervision for large
scale multi-class classification problems necessitates active
learning. Unfortunately, existing active learning methods
for multi-class problems are inherently binary methods and
do not scale up to a large number of classes. In this paper,
we introduce a probabilistic variant of the K-Nearest
Neighbor method for classification that can be seamlessly
used for active learning in multi-class scenarios. Given
some labeled training data, our method learns an accurate
metric/kernel function over the input space that can
be used for classification and similarity search. Unlike existing
metric/kernel learning methods, our scheme is highly
scalable for classification problems and provides a natural
notion of uncertainty over class labels. Further, we
use this measure of uncertainty to actively sample training
examples that maximize discriminating capabilities of the
model. Experiments on benchmark datasets show that the
...