This paper studies the effects of training data on binary text classification and postulates that negative training data is not needed and may even be harmful for the task. Tradit...
: We combine the speed and scalability of information retrieval with the generally superior classification accuracy offered by machine learning, yielding a two-phase text classifie...
Semi-supervised learning methods construct classifiers using both labeled and unlabeled training data samples. While unlabeled data samples can help to improve the accuracy of trai...
The explosion of user-generated content on the Web has led to new opportunities and significant challenges for companies, that are increasingly concerned about monitoring the disc...
We introduce a novel active learning algorithm for classification of network data. In this setting, training instances are connected by a set of links to form a network, the label...