The Dempster-Shafer Theory of Evidence is an established method for combining different sources of information. In this paper we explore ways to improve the combination performanc...
In this paper, we present a semi-supervised learning method for web page classification, leveraging click logs to augment training data by propagating class labels to unlabeled si...
Soo-Min Kim, Patrick Pantel, Lei Duan, Scott Gaffn...
Identity documents, such as ID cards, passports, and driver's licenses, contain textual information, a portrait of the legitimate holder, and eventually some other biometric ...
Justin Picard, Claus Vielhauer, Niels J. Thorwirth
Although electronic media has changed how people interact with documents, today’s electronic documents and the environments in which they are used are still impoverished relativ...
In many important text classification problems, acquiring class labels for training documents is costly, while gathering large quantities of unlabeled data is cheap. This paper sh...
Kamal Nigam, Andrew McCallum, Sebastian Thrun, Tom...