We present the IBM systems for the Rich Transcription 2007 (RT07) speaker diarization evaluation task on lecture meeting data. We first overview our baseline system that was devel...
Abstract. This work describes the development of an automatic estimator of perceptual femininity (PF) of an input utterance using speaker verification techniques. The estimator wa...
With the growing popularity of digitized sports video, automatic analysis of them need be processed to facilitate semantic summarization and retrieval. Playfield plays the fundame...
A key problem for "face in the crowd" recognition from existing surveillance cameras in public spaces (such as mass transit centres) is the issue of pose mismatches betwe...
High-dimensional index is one of the most challenging tasks for content-based video retrieval (CBVR). Typically, in video database, there exist two kinds of clues for query: visual...
Zhiping Shi, Qingyong Li, Zhiwei Shi, Zhongzhi Shi