An instantaneous vector representation of delta pitch for speaker-change prediction in conversational dialogue systems

Download www.cs.cmu.edu

As spoken dialogue systems become deployed in increasingly complex domains, they face rising demands on the naturalness of interaction. We focus on system responsiveness, aiming to mimic human-like dialogue flow control by predicting speaker changes as observed in real human-human conversations. We derive an instantaneous vector representation of pitch variation and show that it is amenable to standard acoustic modeling techniques. Using a small amount of automatically labeled data, we train models which significantly outperform current state-of-the-art pause-only systems, and replicate to within 1% absolute the performance of our previously published hand-crafted baseline. The new system additionally offers scope for run-time control over the precision or recall of locations at which to speak.
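
One way to picture the run-time control over precision or recall is as a decision threshold applied to a per-location model score. The abstract does not describe the mechanism, so the following is only a minimal sketch under that assumption, not the authors' implementation; the function name, scores, and labels are all hypothetical.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall if the system speaks wherever score >= threshold.

    scores: hypothetical model scores, one per candidate location (e.g. a pause).
    labels: True where a speaker change actually occurred.
    """
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y)
    fp = sum(1 for p, y in zip(predicted, labels) if p and not y)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y)
    # Convention chosen for this sketch: an empty denominator yields 1.0.
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall


# Made-up illustration data, not results from the paper.
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [True, True, False, True, False, False]

strict = precision_recall(scores, labels, 0.7)    # speak rarely
lenient = precision_recall(scores, labels, 0.35)  # speak more often
print(strict, lenient)
```

On this toy data, raising the threshold favors precision (fewer but safer places to speak), while lowering it favors recall.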

Kornel Laskowski, Jens Edlund, Mattias Heldner

ICASSP 2008 | Mimic Human-like Dialogue | Real Human-human Conversations | Signal Processing | Spoken Dialogue Systems |

Post Info

Added	30 May 2010
Updated	30 May 2010
Type	Conference
Year	2008
Where	ICASSP
Authors	Kornel Laskowski, Jens Edlund, Mattias Heldner
