Active learning may hold the key to solving the data scarcity problem in supervised learning, i.e., the lack of labeled data. Indeed, labeling data is a costly process, yet an active learner may request labels for only selected instances, thus reducing labeling effort dramatically. Most previous works on active learning are, however, pool-based; that is, a pool of unlabeled examples is given, and the learner can only select examples from the pool to query for their labels. This type of active learning has several weaknesses. In this paper we propose novel active learning algorithms that construct examples directly to query for labels. We study both a specific active learner based on the decision tree algorithm and a general active learner that can work with any base learning algorithm. As there is no restriction on which examples can be queried, our methods are shown to often query fewer examples to reduce the predictive error quickly. This casts doubt on the usefulness of the pool in pool-based active learning.
Charles X. Ling, Jun Du
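To illustrate the contrast between the two query styles described in the abstract, the following is a minimal sketch, not the paper's algorithm: it assumes a toy two-feature task, a hypothetical labeling oracle `label`, and a generic uncertainty score, and its construction step simply scores randomly sampled candidates rather than reproducing the paper's decision-tree-based query construction.

```python
# Illustrative sketch only: pool-based selection vs. direct query construction.
# The task, the oracle `label`, and the candidate-sampling construction step
# are assumptions made for illustration, not the method of Ling and Du.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

def label(x):
    # Hypothetical labeling oracle: class 1 inside the unit circle, else 0.
    return int(x[0] ** 2 + x[1] ** 2 < 1.0)

def pool_query(model, pool):
    # Pool-based: pick the most uncertain example from a fixed, given pool.
    margin = np.abs(model.predict_proba(pool)[:, 1] - 0.5)
    return pool[np.argmin(margin)]

def constructed_query(model, low=-2.0, high=2.0, n_candidates=1000):
    # Direct construction: synthesize an uncertain example anywhere in the
    # input space (here by scoring random candidates; the paper instead
    # constructs queries from the learned decision tree).
    candidates = rng.uniform(low, high, size=(n_candidates, 2))
    margin = np.abs(model.predict_proba(candidates)[:, 1] - 0.5)
    return candidates[np.argmin(margin)]

# Seed with a few labeled examples covering both classes, then alternate
# between training the learner and querying the oracle on constructed examples.
X = np.array([[0.0, 0.0], [1.5, 1.5], [-1.5, 0.5], [0.5, -1.5]])
y = np.array([label(x) for x in X])
model = RandomForestClassifier(random_state=0).fit(X, y)

for _ in range(20):
    x_new = constructed_query(model)   # or pool_query(model, some_pool)
    X = np.vstack([X, x_new])
    y = np.append(y, label(x_new))
    model = RandomForestClassifier(random_state=0).fit(X, y)
```

The point of the contrast is that `pool_query` can only return examples that already exist in the given pool, whereas `constructed_query` may place a query anywhere in the input space, which is what lets a constructing learner probe uncertain regions the pool happens not to cover.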