This paper presents a framework for building rule-based image retrieval (RBIR) systems. Soft-computing-based multimedia data mining techniques are employed to extract and optimize...
This paper presents a spoken document summarization scheme using acoustic, prosodic and semantic information. First, speech recognition confidence is estimated to choose reliable ...
Music visualization provides users with a new interface to browse, search, and navigate their personal digital music collection. Although there are several previous works on visua...
Virtual avatars in many applications are constructed manually or by a single speech-driven model which needs a lot of training data and long training time. It's an essential pro...
In this paper, a prescription-based error concealment (PEC) method is proposed. PEC relies on pre-analyses of the concealment error image (CEI) for I-frames and the optimal error ...
Wen-Nung Lie, Tom C.-I. Lin, Dung-Chan Tsai, Guo-S...
We describe a novel technique for multi-sensory speech processing for enhancing noisy speech and for improved noise-robust speech recognition. Both air- and bone-conductive microph...
Amarnag Subramanya, Li Deng, Zicheng Liu, Zhengyou...
This paper presents our latest work on identifying frame content types for understanding learning media content. In particular, we categorize frames into six classes, namely slide...
A system for retrieving video captured in a ubiquitous environment is presented. Data from pressure-based floor sensors are obtained as a supplementary input together with video f...
Gamhewage C. de Silva, Toshihiko Yamasaki, Takayuk...