Learning Virtual HD Model for Bi-model Emotional Speaker Recognition



Pitch mismatch between training and testing is an important cause of performance degradation in speaker recognition systems. In this paper, we adopted the missing feature theory and specified the Unreliable Region (UR) as the parts of the utterance with high emotion-induced pitch variation. To model these regions, a virtual HD (High Different from neutral, with large pitch offset) model for each target speaker was built from virtual speech, which was converted from neutral speech by the Pitch Transformation Algorithm (PTA). In the PTA, a polynomial transformation function was learned to model the relationship between the average pitch of the neutral utterances and that of the high-pitched utterances. Compared with traditional GMM-UBM and our
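As a rough illustration of the polynomial transformation idea in the abstract, the sketch below fits a least-squares polynomial that maps the average pitch of neutral utterances to the average pitch of high-pitched utterances. It is not the authors' PTA: the function names `fit_polynomial` and `transform_pitch`, the polynomial degree, and the pitch values are all hypothetical and chosen only for demonstration.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit; returns [c0, c1, ..., c_degree]."""
    n = degree + 1
    # Normal equations (V^T V) c = V^T y, with V the Vandermonde matrix of xs.
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def transform_pitch(coeffs, neutral_mean_pitch):
    """Evaluate the fitted polynomial at a neutral average pitch (Hz)."""
    return sum(c * neutral_mean_pitch ** i for i, c in enumerate(coeffs))


# Synthetic (made-up) average pitch values in Hz, one pair per utterance.
neutral_means = [100.0, 120.0, 140.0, 160.0, 180.0, 200.0]
high_means = [150.0, 174.0, 198.0, 222.0, 246.0, 270.0]

# Degree 1 is an assumption; the abstract does not state the degree used.
coeffs = fit_polynomial(neutral_means, high_means, degree=1)
predicted_high = transform_pitch(coeffs, 150.0)
```

In a pipeline like the one the abstract describes, a predicted target pitch of this kind would presumably guide the conversion of neutral speech into virtual high-pitched speech; that conversion step is outside the scope of this sketch.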

Ting Huang, Yingchun Yang


Computer Vision | ICPR 2010 | Pitch Mismatch | Pitch Transformation Algorithm | Polynomial Transformation Function |


Post Info

Added	12 Feb 2011
Updated	12 Feb 2011
Type	Journal
Year	2010
Where	ICPR
Authors	Ting Huang, Yingchun Yang
