Typical broadcast material contains not only studio-recorded texts read by trained speakers, but also spontaneous and dialect speech, debates with cross-talk, voice-overs, and on-...
Doris Baum, Daniel Schneider, Rolf Bardeli, Jochen...
We propose a new two-stage framework for joint analysis of head gesture and speech prosody patterns of a speaker toward automatic realistic synthesis of head gestures from speech p...
Relationships that link static documents discussed during meetings to the corresponding speech transcripts can be of various kinds. The most important ones, thematic links, quotat...
The TNO speaker speaker diarization system is based on a standard BIC segmentation and clustering algorithm. Since for the NIST Rich Transcription speaker dizarization evaluation m...
Monaural speech segregation in reverberant environments is a very difficult problem. We develop a supervised learning approach by proposing an objective function that directly rel...