Traditionally, object recognition is performed based solely on the appearance of the object. However, relevant information also exists in the scene surrounding the object. As supp...
Video-based recognition and prediction of a temporally extended activity can benefit from a detailed description of high-level expectations about the activity. Stochastic grammars...
Multimodal grammars provide an expressive formalism for multimodal integration and understanding. However, handcrafted multimodal grammars can be brittle with respect to unexpecte...
Many computer vision systems try to infer semantic information about a video scene content by looking at the time series of the silhouettes of the moving objects. This paper propo...
We investigate a biologically motivated approach to fast visual classification, directly inspired by the recent work [13]. Specifically, trading-off biological accuracy for comput...