In this paper, we adopt a direct modeling approach to utilizing conversational gesture cues for detecting sentence unit (SU) boundaries in videotaped conversations. We treat SU detection as a classification task: for each inter-word boundary, the classifier decides whether an SU boundary is present. In addition to gesture cues, we also utilize prosodic and lexical knowledge sources. In this first investigation, we find that gesture features complement the prosodic and lexical knowledge sources for this task. By combining all of the knowledge sources, the model achieves the lowest overall SU detection error rate.

Categories and Subject Descriptors: H.5.1 [Multimedia Information Systems]: Audio and Video Input; H.5.5 [Sound and Music Computing]: Modeling and Signal Analysis; I.2.7 [Natural Language Processing]: Dialog Processing

General Terms: Algorithms, Performance, Experimentation, Languages
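The per-boundary classification framing described above can be sketched minimally as follows. This is an illustrative sketch only, not the paper's model: the feature names (`pause_dur`, `pitch_reset`, `gesture_hold`), the linear scoring, and the weights are all assumptions standing in for the actual prosodic, lexical, and gesture knowledge sources.

```python
# Hypothetical sketch: SU detection as binary classification over
# inter-word boundaries, combining multiple knowledge sources.
# Feature names and weights are illustrative, not from the paper.

def su_score(features, weights):
    """Linear score over the combined knowledge-source features."""
    return sum(weights[name] * value for name, value in features.items())

def detect_su(boundaries, weights, threshold=0.5):
    """Label each inter-word boundary as SU (True) or non-SU (False)."""
    return [su_score(feats, weights) > threshold for feats in boundaries]

# Each dict holds cues extracted at one inter-word boundary:
# prosodic (pause_dur, pitch_reset) and gestural (gesture_hold).
weights = {"pause_dur": 1.0, "pitch_reset": 0.8, "gesture_hold": 0.6}
boundaries = [
    {"pause_dur": 0.9, "pitch_reset": 1.0, "gesture_hold": 1.0},  # strong cues
    {"pause_dur": 0.1, "pitch_reset": 0.0, "gesture_hold": 0.0},  # weak cues
]
labels = detect_su(boundaries, weights)
```

In practice the paper's direct modeling approach would learn such a decision from data rather than use fixed weights; the sketch only shows the shape of the per-boundary decision.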
Mary P. Harper, Elizabeth Shriberg