Feature selection is the task of choosing a small subset of a given set of features that captures the relevant properties of the data. In the context of supervised classification ...
We propose a multiclass (MC) classification approach to text categorization (TC). To fully take advantage of both positive and negative training examples, a maximal figure-of-meri...
Text categorization algorithms usually represent documents as bags of words and consequently have to deal with huge numbers of features. Most previous studies found that the major...
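As a rough illustration of the bag-of-words representation mentioned above, the following minimal sketch maps each document to a vector of word counts over a shared vocabulary. The function name `bag_of_words`, the whitespace tokenizer, and the two toy documents are assumptions made for this example and are not taken from the abstract.

```python
from collections import Counter


def bag_of_words(doc):
    """Map a document to its word counts, discarding word order.

    Hypothetical helper: tokenization is a plain lowercase whitespace split.
    """
    return Counter(doc.lower().split())


# Toy corpus (illustrative only).
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
]

bags = [bag_of_words(d) for d in docs]

# One feature per distinct word in the corpus.
vocabulary = sorted(set().union(*bags))

# A Counter returns 0 for words a document does not contain.
vectors = [[bag[w] for w in vocabulary] for bag in bags]

print(vocabulary)
print(vectors)
```

Because the vocabulary gains one dimension for every distinct word, the feature count grows with the corpus, which is the source of the "huge numbers of features" the abstract refers to.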
We propose a fast iterative classification algorithm for Kernel Fisher Discriminant (KFD) using heterogeneous kernel models. In contrast with the standard KFD that requires the us...
We present a trainable sequential-inference technique for processes with large state and observation spaces and relational structure. Our method assumes "reliable observation...
A critical problem in cluster ensemble research is how to combine multiple clusterings to yield a final superior clustering result. Leveraging advanced graph partitioning techniqu...
In this paper we extend previous results providing a theoretical analysis of a new Monte Carlo ensemble classifier. The framework allows us to characterize the conditions under wh...
Most existing algorithms for learning decision trees are greedy: a tree is induced top-down, with a locally optimal decision made at each node. In most cases, however, ...
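To make "locally optimal decisions at each node" concrete, here is a minimal sketch of one greedy step: choosing the single binary feature with the highest information gain. The names `entropy` and `best_split`, the restriction to 0/1 features, and the use of information gain as the criterion are assumptions for illustration; the abstract does not say which criterion or algorithm the authors use.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature_index, gain) for the split with the highest gain.

    Hypothetical helper: assumes every feature is binary (0 or 1).
    Returns (None, 0.0) when no split has strictly positive gain.
    """
    n = len(labels)
    base = entropy(labels)
    best = (None, 0.0)
    for j in range(len(rows[0])):
        left = [y for x, y in zip(rows, labels) if x[j] == 0]
        right = [y for x, y in zip(rows, labels) if x[j] == 1]
        if not left or not right:
            continue  # this feature does not separate the data at all
        gain = (base
                - (len(left) / n) * entropy(left)
                - (len(right) / n) * entropy(right))
        if gain > best[1]:
            best = (j, gain)
    return best


rows = [(0, 0), (0, 1), (1, 0), (1, 1)]

# The label equals feature 0, so the greedy step finds it immediately.
easy = best_split(rows, [0, 0, 1, 1])

# XOR labels: each feature alone carries no information about the class.
xor = best_split(rows, [0, 1, 1, 0])

print(easy, xor)
```

The XOR case is a standard example of where a one-step criterion like this sees no useful split, even though two successive splits would classify the data perfectly.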
Principal component analysis (PCA) is a widely used statistical technique for unsupervised dimension reduction. K-means clustering is a commonly used data clustering method for unsupervi...