In this paper, we describe an approach that combines visual and audio features to cluster shots of the same person in video programs. We use color h...
We present a method for visual classification of actions and events captured from an egocentric point of view. The method tackles the challenge of a moving camera by creating defor...
Most current digital cameras use a single-sensor design, which limits the number of channels recorded at each pixel location to one. However, a color image is represente...
Computational models of grounded language learning have been based on the premise that words and concepts are learned simultaneously. Given the mounting cognitive evidence for conc...
Many of the world's cultural heritage sites and landscapes have been lost over time to urbanization. Digital archive projects that digitize these...