When text categorization is applied to complex tasks, it is tedious and expensive to hand-label the large amounts of training data necessary for good performance. In this paper, we...
We compare two statistical methods for identifying spam or junk electronic mail. Spam filters are classifiers which determine whether an email is junk or not. The proliferation ...
Modern relational database systems are beginning to support ad hoc queries on mining models. In this paper, we explore novel techniques for optimizing queries that apply mining mo...
Surajit Chaudhuri, Vivek R. Narasayya, Sunita Sara...
Bayesian networks are a powerful probabilistic representation, and their use for classification has received considerable attention. However, they tend to perform poorly when lear...
In this paper we investigate a discriminative approach to feature weighting for topic identification using minimum classification error (MCE) training. Our approach learns featu...