In the context of binary classification, we define disagreement as a measure of how often two independently-trained models differ in their classification of unlabeled data. We exp...
We consider the semi-supervised learning problem, where a decision rule is to be learned from labeled and unlabeled data. In this framework, we motivate minimum entropy regulariza...
The main problems in text classification are the lack of labeled data and the cost of labeling unlabeled data. We address these problems by exploring co-training - an algo...
Using unlabeled data to help supervised learning has become an increasingly attractive methodology and has proven effective in many applications. This paper applies semi-supervi...
Current Named Entity Recognition systems suffer from a lack of hand-tagged data as well as degradation when moving to other domains. This paper explores two aspects: the automati...
This paper proposes a semi-supervised boosting approach to improve statistical word alignment with limited labeled data and large amounts of unlabeled data. The proposed approach ...
We present a novel framework for multi-label learning that explicitly addresses the challenge arising from a large number of classes and a small amount of training data. The key a...
Feature selection is an important task in effective data mining. A new challenge to feature selection is the so-called “small labeled-sample problem” in which labeled data is...
We show how to use unlabeled data and a deep belief net (DBN) to learn a good covariance kernel for a Gaussian process. We first learn a deep generative model of the unlabeled da...
Semi-supervised methods use unlabeled data in addition to labeled data to construct predictors. While existing semi-supervised methods have shown some promising empirical performa...
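The disagreement measure described in the first abstract above (how often two independently trained binary classifiers differ on unlabeled data) can be illustrated with a minimal sketch. The function name `disagreement_rate` and the choice to represent predictions as plain lists of 0/1 labels are assumptions made for illustration; the abstract does not specify an implementation, and the paper may normalize or estimate the quantity differently.

```python
from typing import Sequence


def disagreement_rate(preds_a: Sequence[int], preds_b: Sequence[int]) -> float:
    """Fraction of unlabeled examples on which two models predict different labels.

    Hypothetical helper: preds_a and preds_b are the predicted labels that two
    independently trained classifiers assign to the same unlabeled examples,
    given in the same order.
    """
    if len(preds_a) != len(preds_b):
        raise ValueError("both models must be evaluated on the same examples")
    if len(preds_a) == 0:
        raise ValueError("at least one unlabeled example is required")

    # Count positions where the two predicted labels differ.
    differing = sum(1 for a, b in zip(preds_a, preds_b) if a != b)
    return differing / len(preds_a)


# Two models that differ on one of four unlabeled examples.
rate = disagreement_rate([0, 1, 1, 0], [0, 1, 0, 0])
print(rate)  # 0.25
```

No ground-truth labels are needed to compute this quantity, which is why it can be estimated from unlabeled data alone.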