Speech recognition is usually based on Hidden Markov Models (HMMs), which represent the temporal dynamics of speech very efficiently, and Gaussian mixture models, which do non-optimally the classification of speech into single speech units (phonemes). In this paper we use parallel mixtures of Support Vector Machines (SVMs) for classification by integrating this method in a HMM-based speech recognition system. SVMs are very appealing due to their association with statistical learning theory and have already shown good results in pattern recognition and in continuous speech recognition. They suffer however from the effort for training which scales at least quadratic with respect to the number of training vectors. The SVM mixtures need only nearly linear training time making it easier to deal with the large amount of speech data. In our hybrid system we use the SVM mixtures as acoustic models in a HMM-based decoder. We train and test the hybrid system on the DARPA Resource Management (RM...
Sven E. Krüger, Martin Schafföner, Marce