When human listeners utter Listener Responses (e.g. back-channels or acknowledgments) such as ‘yeah’ and ‘mmhmm’, their interlocutors commonly continue speaking or resume their speech even before the listener has finished the response. This type of speech interactivity produces the frequent speech overlap that is common in human-human conversation. To enable this type of speech interactivity between humans and spoken dialog systems, and thereby achieve more continuous, smoother, and more human-like human-machine interaction, we propose an on-line classifier that labels incoming speech as Listener Responses. We show that vocal Listener Responses can be detected under maximum latency thresholds of 100-500 ms, yielding equal error rates ranging from 34% down to 28% using an energy-based voice activity detector.
Daniel Neiberg, Khiet P. Truong
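To illustrate the kind of on-line decision the abstract describes, the following is a minimal sketch of frame-level energy-based voice activity detection combined with a maximum-latency decision rule. The frame length, energy threshold, latency budget, and the 200 ms duration heuristic are illustrative assumptions and are not the classifier or parameters reported in the paper.

```python
import numpy as np

# Illustrative sketch only: energy-based voice activity detection with a
# maximum-latency decision rule. All constants below are assumed values,
# not the settings used in the paper (which reports latencies of 100-500 ms).

FRAME_MS = 10            # assumed analysis frame length in milliseconds
ENERGY_THRESHOLD = 1e-4  # assumed energy threshold separating speech from silence
MAX_LATENCY_MS = 300     # example latency budget within the 100-500 ms range

def frame_energy(frame: np.ndarray) -> float:
    """Mean squared amplitude of one audio frame."""
    return float(np.mean(frame ** 2))

def classify_with_latency(frames) -> str:
    """Decide, using only the frames available within the latency budget,
    whether an incoming utterance looks like a short Listener Response
    candidate (e.g. 'yeah', 'mmhmm') or longer speech."""
    max_frames = MAX_LATENCY_MS // FRAME_MS
    voiced_frames = 0
    for i, frame in enumerate(frames):
        if i >= max_frames:  # latency budget exhausted: decide now
            break
        if frame_energy(frame) > ENERGY_THRESHOLD:
            voiced_frames += 1
    # Assumed heuristic: short voiced stretches within the budget are treated
    # as Listener Response candidates; longer voiced stretches are not.
    return "listener_response" if voiced_frames * FRAME_MS <= 200 else "other_speech"
```

The key point the sketch conveys is that the decision must be made from the audio observed before the latency threshold expires, rather than after the utterance has ended.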