Many images--especially those used for page design on web pages--as well as videos contain visible text. If these text occurrences could be detected, segmented, and recognized auto...
This paper presents a novel approach for content-based analysis of karaoke music, which utilizes multimodal contents including synchronized lyrics text from the video channel and ...
In this paper we present a general framework for object detection and segmentation. Using a bottom-up unsupervised merging algorithm, a region-based hierarchy that represents the ...
We introduce a novel and inexpensive approach for the temporal alignment of speech to highly imperfect transcripts from automatic speech recognition (ASR). Transcripts are generat...
The automatic extraction of sports video highlights is a typical kind of personalized media production process. Many ways have been studied from the viewpoints of lowlevel audio/v...