We describe a method to align ASL video subtitles with a closed-caption transcript. Our alignments are partial, based on spotting words within the video sequence, which consists o...
The compositional nature of visual objects significantly limits their representation complexity and renders learning of structured object models tractable. Adopting this modeling ...
Local spatiotemporal features or interest points provide compact but descriptive representations for efficient video analysis and motion recognition. Current local feature extract...
An effective object recognition scheme is to represent and match images on the basis of histograms derived from photometric color invariants. A drawback, however, is that certain c...
A comprehensive novel multi-view dynamic face model is presented in this paper to address two challenging problems in face recognition and facial analysis: modelling faces with la...