Person identification using audio (speech) and visual (facial appearance, static or dynamic) modalities, either independently or jointly, is a thoroughly investigated problem in pa...
After two successful years at SIGIR in 2007 and 2008, the third workshop on Searching Spontaneous Conversational Speech (SSCS 2009) was held in conjunction with the ACM Multimedia 20...
Martha Larson, Roeland Ordelman, Franciska de Jong...
This paper describes experiments in automatic recognition of context-independent phoneme strings from meeting data using audiovisual features. Visual features are known to improve ...
This paper is concerned with the automatic recognition of dialogue acts (DAs) in multiparty conversational speech. We present a joint generative model for DA recognition ...
In this paper, we revisit the linear transformation for VTLN on conventional MFCC proposed by Sanand et al. in [1], using the idea of band-limited interpolation. The filter-bank i...