It is difficult to apply machine learning to new domains because often we lack labeled problem instances. In this paper, we provide a solution to this problem that leverages domai...
Automatic classification of documents is an important area of research with many applications in the fields of document searching, forensics and others. Methods to perform class...
In this paper, we present a new co-training strategy that makes use of unlabelled data. It trains two predictors in parallel, with each predictor labelling the unlabelled data for...
Many text documents naturally have two kinds of labels. For example, we may label web pages from universities according to their categories, such as "student" or "fa...
In this paper, we introduce a method that automatically builds text classifiers in a new language by training on already labeled data in another language. Our method transfers the...