Active learning strategies can be useful when manual labeling
effort is scarce, as they select the most informative
examples to be annotated first. However, for visual category
learning, the active selection problem is particularly
complex: a single image typically carries multiple object
labels, and an annotator could provide several types
of annotation (e.g., class labels, bounding boxes, segmentations),
each of which requires a different amount of
manual effort. We present an active learning framework
that predicts the tradeoff between the effort and information
gain associated with a candidate image annotation, thereby
ranking unlabeled and partially labeled images according
to their expected “net worth” to an object recognition system.
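
As a rough illustration of the ranking idea, the sketch below scores each candidate annotation by predicted information gain minus predicted effort and sorts by that score. The subtraction rule, the `Candidate` fields, the function names, and the numbers are all assumptions made for illustration, not the formulation used in this work; it also assumes gain and cost are already expressed in comparable units.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One possible annotation request (hypothetical structure)."""
    image_id: str
    annotation_type: str   # e.g. "class_label", "bounding_box", "segmentation"
    expected_gain: float   # predicted information gain (assumed units)
    expected_cost: float   # predicted manual effort, in the same assumed units


def net_worth(c: Candidate) -> float:
    # Assumed scoring rule: predicted benefit minus predicted effort.
    return c.expected_gain - c.expected_cost


def rank_candidates(candidates):
    # Highest net worth first, so the best trade-off is requested first.
    return sorted(candidates, key=net_worth, reverse=True)


# Made-up values: the same image can appear with different annotation types.
candidates = [
    Candidate("img_a", "segmentation", expected_gain=5.0, expected_cost=4.0),
    Candidate("img_b", "class_label", expected_gain=2.0, expected_cost=0.5),
    Candidate("img_a", "bounding_box", expected_gain=3.0, expected_cost=2.5),
]
ranked = rank_candidates(candidates)
```

In this toy example a cheap class label outranks a more informative but far more expensive segmentation, which is the kind of trade-off the ranking is meant to capture.
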
We develop a multi-label multiple-instance approach
that accommodates multi-object images and a mixture of
strong and weak labels. Since the annotation cost can vary
depending on an image’s complexity, we show how to imp...