In this paper we evaluate the effect of a speaker's emotional state on text-independent speaker identification. The spectral features used for speaker recogni...
Marius Vasile Ghiurcau, Corneliu Rusu, Jaakko Asto...
We propose a statistical framework for high-level feature extraction that uses SIFT Gaussian mixture models (GMMs) and audio models. SIFT features were extracted from all the im...
The popular mel-frequency cepstral coefficients (MFCCs) capture a mixture of speaker-related, phonemic and channel information. Speaker-related information could be further broke...