To answer multimodal medical queries, we propose splitting the queries into different dimensions using an ontology. We extract both textual and visual terms depending on the ont...
Event detection is of great importance in high-level semantic indexing and selective browsing of video clips. However, the use of low-level visual-audio feature descriptors alone ...
Shu-Ching Chen, Min Chen, Chengcui Zhang, Mei-Ling...
We present a framework to synchronize pop music with its corresponding text lyrics. We refine the line-level alignment achievable by existing work to the syllabic level by using a dynamic progra...
Automatic text detection in video is an important task for efficient and accurate indexing and retrieval of multimedia data, such as event identification, event boundary identific...
Shivakumara Palaiahnakote, Trung Quy Phan, Chew-Li...
This paper presents a spoken document summarization scheme using acoustic, prosodic and semantic information. First, speech recognition confidence is estimated to choose reliable ...