Abstract. In this paper, we present a framework for segmenting the news programs into different story topics. The proposed method utilizes both visual and text information of the v...
The scope of this paper is the interpretation of a user's intention via a video camera and a speech recognizer. In comparison to previous work which only takes into account g...
In this paper, we focus on the use of three different techniques that support automatic derivation of video content from raw video data, namely, a spatio-temporal rule-based metho...
In this paper, we present an approach for speaker change detection in broadcast video using joint audio-visual scene change statistics. Our experiments indicate that using joint a...
We introduce a novel and inexpensive approach for the temporal alignment of speech to highly imperfect transcripts from automatic speech recognition (ASR). Transcripts are generat...