State-of-the-art speaker diarization systems for meetings are now at a point where overlapped speech contributes significantly to the errors made by the system. However, little i...
Kofi Boakye, B. Trueba-Hornero, Oriol Vinyals, Ger...
In this paper we describe the AMIDA speaker dizarization system as it was submitted to the NIST Rich Transcription evaluation 2007 for conference room data. This is done in the con...
We present the IBM systems for the Rich Transcription 2007 (RT07) speaker diarization evaluation task on lecture meeting data. We first overview our baseline system that was devel...
This paper presents a bottom-up approach that combines audio and video to simultaneously locate individual speakers in the video (2-D source localization) and segment their speech ...
In this paper we address the problem of estimating who is speaking from automatically extracted low resolution visual cues in group meetings. Traditionally, the task of speech/non...