Sciweavers

368 search results - page 35 / 74
» Scene Determination Based on Video and Audio Features
TSD
2004
Springer
14 years 2 months ago
Multimodal Phoneme Recognition of Meeting Data
This paper describes experiments in automatic recognition of context-independent phoneme strings from meeting data using audiovisual features. Visual features are known to improve ...
Petr Motlíček, Jan Černocký
FGR
2004
IEEE
14 years 24 days ago
Trainable Videorealistic Speech Animation
We describe how to create with machine learning techniques a generative, videorealistic, speech animation module. A human subject is first recorded using a videocamera as he/she u...
Tony Ezzat, Gadi Geiger, Tomaso Poggio
ICIAP
2003
ACM
14 years 9 months ago
A real-time text-independent speaker identification system
The paper presents a real-time speaker identification system based on the analysis of the audio track of a video stream. The system has been employed in the context of automatic v...
Luigi P. Cordella, Pasquale Foggia, Carlo Sansone,...
ICIP
2004
IEEE
14 years 10 months ago
Detection of unique people in news programs using multimodal shot clustering
In this paper we describe an approach that uses a combination of visual and audio features to cluster shots belonging to the same person together in video programs. We use color h...
Alberto Albiol, Cüneyt M. Taskiran, Edward J....
IAAI
2003
13 years 10 months ago
Broadcast News Understanding and Navigation
The Broadcast News Editor (BNE) and Broadcast News Navigator (BNN) are fully implemented systems that exploit integrated image, speech, and language processing to support intellig...
Mark T. Maybury