We consider the multi-class classification problem, based on vector observation sequences, where the class-conditional probability distributions of the observations as well as the unconditional probability distribution of the observations are unknown. We develop a novel formulation that couples model training with the quality of classification achievable using the learned models. The parametric models we use are finite mixture models in which the same component densities are shared across the models for all classes, albeit with different mixture weights. We thus use a model known in the neural network literature as the All-Class-One-Network (ACON) model. We argue that this is a more appropriate model for context-dependent classification, which is common in bioinformatics. We rigorously derive the solution to this joint optimization problem. A key step in our approach is to consider a provably tight bound between the average Bayes error (the true minimal ...
Alex S. Baras, John S. Baras
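To make the shared-component structure described above concrete, the following is a minimal sketch of an ACON-style mixture formulation; the notation (component densities f_k, class-specific weights w_{c,k}, number of components K, and class priors pi_c) is illustrative and not taken from the paper.

% Illustrative sketch (notation assumed, not from the paper):
% every class c uses the same K component densities f_k,
% differing only in its mixture weights w_{c,k}.
\begin{align}
  p(x \mid c) &= \sum_{k=1}^{K} w_{c,k}\, f_k(x),
  \qquad w_{c,k} \ge 0,\quad \sum_{k=1}^{K} w_{c,k} = 1, \\
  p(x) &= \sum_{c} \pi_c\, p(x \mid c)
  \qquad \text{(unconditional distribution, with class priors } \pi_c\text{)}.
\end{align}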