— In a previous paper [3], we proposed a new way to achieve visual servoing. Rather than minimizing the error between the position of two set of geometric features, we proposed t...
Visual codebook based quantization of robust appearance descriptors extracted from local image patches is an effective means of capturing image statistics for texture analysis and...
This paper presents a bottom-up approach that combines audio and video to simultaneously locate individual speakers in the video (2-D source localization) and segment their speech ...