In the field of computer analysis of document images, the problems of physical and logical layout analysis have been approached through a variety of heuristic, rule-based, and gr...
The two main challenges typically associated with mining data streams are concept drift and data contamination. To address these challenges, we seek learning techniques and models ...
The problem of classification from positive and unlabeled examples currently attracts much attention. However, when the number of unlabeled negative examples is very sma...
Xiaoling Wang, Zhen Xu, Chaofeng Sha, Martin Ester...
In this paper, we experimentally evaluated the effect of outlier detection methods on the prediction performance of fault-proneness models. Detected outliers were removed ...
We describe a very simple technique for discriminatively training a spam filter. Our results on the TREC Enron spam corpus would have been the best for the Ham at .1% measure, and...