We present an overview of the data collection and transcription efforts for the COnversational Speech In Noisy Environments (COSINE) corpus. The corpus is a set of multi-party con...
Alex Stupakov, Evan Hanusa, Jeff A. Bilmes, Dieter...
In this paper we develop a confidence measure that can determine if a given set of samples is suitable for inclusion in the reconstruction of a higher resolution dataset. The con...
From an audio perspective, the present state of teleconferencing technology leaves something to be desired; speaker overlap is one of the causes of this inadequate performance. To...
In many pattern recognition tasks, given some input data and a family of models, the “best” model is defined as the one that maximizes the likelihood of the data given the m...
Tara N. Sainath, Dimitri Kanevsky, Bhuvana Ramabha...
Past research on automatic laughter detection has focused mainly on audio-based detection. Here we present an audiovisual approach to distinguishing laughter from speech, and we sh...