We introduce a computational model of sensor fusion based on the topographic representations of a "two-microphone and one camera" configuration. Our aim is to perform a robust...
Prosody has been actively studied as an important knowledge source for speech recognition and understanding. In this paper, we are concerned with the question of exploiti...
Extractive text summarization aims to create a condensed version of one or more source documents by selecting the most informative sentences. Research in text summarization has th...
In this paper, a motion-based approach for detecting high-level semantic events in video sequences is presented. Its main characteristic is its generic nature, i.e. it can be direc...
In this paper, we propose a framework that fuses multiple features for improved action recognition in videos. The fusion of multiple features is important for recognizing actions ...