Sciweavers

376 search results - page 41 / 76
» Analysis-by-synthesis features for speech recognition
ICPR
2010
IEEE
14 years 25 days ago
Crossmodal Matching of Speakers Using Lip and Voice Features in Temporally Non-Overlapping Audio and Video Streams
Person identification using audio (speech) and visual (facial appearance, static or dynamic) modalities, either independently or jointly, is a thoroughly investigated problem in pa...
Anindya Roy, Sebastien Marcel
SIGIR
2010
ACM
13 years 8 months ago
Multimedia with a speech track: searching spontaneous conversational speech
After two successful years at SIGIR in 2007 and 2008, the third workshop on Searching Spontaneous Conversational Speech (SSCS 2009) was held in conjunction with the ACM Multimedia 20...
Martha Larson, Roeland Ordelman, Franciska de Jong...
TSD
2004
Springer
14 years 3 months ago
Multimodal Phoneme Recognition of Meeting Data
This paper describes experiments in automatic recognition of context-independent phoneme strings from meeting data using audiovisual features. Visual features are known to improve ...
Petr Motlíček, Jan Černocký
TASLP
2008
13 years 9 months ago
Recognition of Dialogue Acts in Multiparty Meetings Using a Switching DBN
This paper is concerned with the automatic recognition of dialogue acts (DAs) in multiparty conversational speech. We present a joint generative model for DA recognition ...
Alfred Dielmann, Steve Renals
INTERSPEECH
2010
13 years 4 months ago
Revisiting VTLN using linear transformation on conventional MFCC
In this paper, we revisit the linear transformation for VTLN on conventional MFCC proposed by Sanand et al. in [1], using the idea of band-limited interpolation. The filter-bank i...
Doddipatla Rama Sanand, Ralf Schlüter, Herman...