In large vocabulary continuous speech recognition, decision trees are widely used to cluster triphone states. In addition to commonly used phonetically based questions, others hav...
Hank Liao, Christopher Alberti, Michiel Bacchiani,...
Epoch is the instant of significant excitation of the vocal-tract system during production of speech. For most voiced speech, the most significant excitation takes place around the...
A novel Statistical Approach for F0 Estimation, SAFE, is proposed to improve the accuracy of F0 tracking under both clean and additive noise conditions. Prominent Signal-to-Noise ...
This paper presents a novel application of speech emotion recognition: estimation of the level of conversational engagement between users of a voice communication system. We begin...
In this paper we describe a corpus assembled from two sub-corpora. The CINEMO corpus contains acted emotional expression obtained through dubbing exercises. This new protoco...