Abstract. Systems for keyword and non-linguistic vocalization detection in conversational agent applications need to be robust with respect to background noise and different speaking styles. Focusing on the Sensitive Artificial Listener (SAL) scenario, which involves spontaneous, emotionally colored speech, this paper proposes a multi-stream model that applies the principle of Long Short-Term Memory (LSTM) to generate context-sensitive phoneme predictions that can be used for keyword detection. Furthermore, we investigate the incorporation of noisy training material in order to create noise-robust acoustic models. We show that both strategies improve recognition performance when evaluated on spontaneous human-machine conversations from the SEMAINE database.
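To make the core idea concrete, the following is a minimal, hypothetical sketch (not the authors' implementation): a bidirectional LSTM maps acoustic feature frames to context-sensitive framewise phoneme posteriors, which a simple detector then scans for a keyword's phoneme sequence. All names, dimensions, and the toy decoding rule are illustrative assumptions; the paper's actual multi-stream decoder is more elaborate.

    # Minimal sketch: BLSTM phoneme posteriors feeding a toy keyword detector.
    # Hypothetical names and dimensions throughout (e.g. 39 MFCC-style features).
    import torch
    import torch.nn as nn

    class PhonemePredictor(nn.Module):
        def __init__(self, n_features=39, n_phonemes=41, hidden=128):
            super().__init__()
            self.blstm = nn.LSTM(n_features, hidden, batch_first=True,
                                 bidirectional=True)
            self.out = nn.Linear(2 * hidden, n_phonemes)

        def forward(self, x):                       # x: (batch, frames, n_features)
            h, _ = self.blstm(x)                    # context-sensitive frame encodings
            return self.out(h).log_softmax(dim=-1)  # framewise phoneme log-posteriors

    def detect_keyword(log_post, keyword_phonemes, threshold=-2.0):
        """Toy rule: the keyword fires if its phonemes occur in order,
        each winning at least one frame above the posterior threshold."""
        idx = 0
        best = log_post.argmax(dim=-1)              # most likely phoneme per frame
        for t in range(best.shape[0]):
            p = keyword_phonemes[idx]
            if best[t] == p and log_post[t, p] > threshold:
                idx += 1
                if idx == len(keyword_phonemes):
                    return True
        return False

    # Usage on random features standing in for one utterance of 200 frames:
    model = PhonemePredictor()
    log_post = model(torch.randn(1, 200, 39))[0]    # (frames, n_phonemes)
    print(detect_keyword(log_post, keyword_phonemes=[5, 12, 7]))

In this sketch, noise robustness as investigated in the paper would enter through the training data (clean plus noise-corrupted utterances) rather than through any change to the model code itself.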