We approached the problem of classifying papers for the TREC 2004 Genomics Track triage task as a four step process: feature generation, feature selection, classifier training, an...
Aaron M. Cohen, Ravi Teja Bhupatiraju, William R. ...
We present a novel sequential clustering algorithm which is motivated by the Information Bottleneck (IB) method. In contrast to the agglomerative IB algorithm, the new sequential ...
Quantization of continuous variables is important in data analysis, especially for some model classes such as Bayesian networks and decision trees, which use discrete variables. Of...
Class membership probability estimates are important for many applications of data mining in which classification outputs are combined with other sources of information for decisi...
This paper discusses building complex classifiers from a single labeled example and vast number of unlabeled observation sets, each derived from observation of a single process or...