The appearance of non-rigid objects detected and tracked in video streams is highly variable and therefore makes the identification of similar objects very complex. Furthermore, i...
There has been little work that attempts to improve the recognition of spontaneous, conversational speech by adding information from a loosely-coupled modality. This study investi...
Visual vocabulary serves as a fundamental component in many computer vision tasks, such as object recognition, visual search, and scene modeling. While state-of-the-art approaches...
Eye gaze and gesture form key conversational grounding cues that are used extensively in face-to-face interaction among people. To accurately recognize visual feedback during inter...
We address the problem of tracking and recognizing faces in real-world, noisy videos. We track faces using a tracker that adaptively builds a target model reflecting changes in ap...
Minyoung Kim, Sanjiv Kumar, Vladimir Pavlovic, Hen...