Modeling the intonation of discourse segments for improved online dialog act tagging

Prosody is an important cue for identifying dialog acts. In this paper, we show that modeling the sequence of acoustic-prosodic values as n-gram features with a maximum entropy model for dialog act (DA) tagging can perform better than conventional approaches that use a coarse representation of the prosodic contour through acoustic correlates of prosody. We also propose a discriminative framework that exploits preceding context in the form of lexical and prosodic cues from previous discourse segments. Such a scheme facilitates online DA tagging and offers robustness in the decoding process, unlike greedy decoding schemes that can potentially propagate errors. Using only lexical and prosodic cues from the 3 previous utterances, we achieve a DA tagging accuracy of 72%, compared with 74% in the best-case scenario where the previous DA tag is known exactly.
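As a rough illustration of the idea in the abstract, the sketch below turns a prosodic contour into n-gram features and adds features from up to 3 preceding segments. It is not the authors' implementation: the quantization into 4 bins, the feature naming, and the helper names `quantize`, `ngram_features`, and `segment_features` are all assumptions made for this example. The resulting feature counts could then be passed to any maximum entropy (multinomial logistic regression) classifier.

```python
from collections import Counter


def quantize(values, n_bins=4):
    """Map continuous prosodic values (e.g. normalized f0) to integer bins.

    Assumption: uniform bins over the segment's own range; the paper may
    use a different discretization.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid division by zero on flat contours
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def ngram_features(symbols, max_n=3, prefix="pros"):
    """Count all n-grams (n = 1..max_n) over a sequence of bin symbols."""
    feats = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(symbols) - n + 1):
            gram = "_".join(str(s) for s in symbols[i:i + n])
            feats[f"{prefix}:{gram}"] += 1
    return feats


def segment_features(history, current, context=3):
    """Features for the current segment plus up to `context` previous ones.

    `history` is a list of earlier contours, oldest first (hypothetical
    input format). No previous DA tags are used, only prosodic context.
    """
    feats = ngram_features(quantize(current), prefix="cur")
    recent = history[max(0, len(history) - context):]
    for k, prev in enumerate(reversed(recent), start=1):
        feats.update(ngram_features(quantize(prev), prefix=f"prev{k}"))
    return feats


if __name__ == "__main__":
    previous = [[1.0, 2.0]]
    contour = [0.0, 1.0, 2.0, 3.0]
    print(segment_features(previous, contour))
```

Because the context features come from observed cues rather than predicted tags, a tagger built this way does not feed its own earlier decisions back in, which is the property the abstract contrasts with greedy decoding.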

Vivek Kumar Rangarajan Sridhar, Shrikanth Narayanan, Srinivas Bangalore

DA Tagging Accuracy | Dialog Act | ICASSP 2008 | Prosodic Cues | Signal Processing

Post Info

Added	30 May 2010
Updated	30 May 2010
Type	Conference
Year	2008
Where	ICASSP
Authors	Vivek Kumar Rangarajan Sridhar, Shrikanth Narayanan, Srinivas Bangalore
