This paper examines tagging models for spontaneous English speech transcripts. We analyze the performance of state-of-the-art tagging models, either generative or discriminative, ...
Vladimir Eidelman, Zhongqiang Huang, Mary P. Harpe...
In the domain of candidly captured student presentation videos, we examine and evaluate approaches for multimodal analysis and indexing of audio and video. We apply visual segment...
We investigate methods of segmenting, visualizing, and indexing presentation videos by both audio and visual data. The audio track is segmented by speaker, and augmented with key ...
With the rapidly growing use of audio and multimedia information over the Internet, the technology for retrieving speech information using voice queries is becoming more and mo...