The automatic transcription of broadcast news and meetings involves the segmentation, identification and tracking of speaker turns during each session, which is known as speaker di...
My thesis aims to contribute towards building autonomous agents that are able to understand their surrounding environment through the use of both audio and visual information. To ...
The Lexical Access Problem consists of determining the intended sequence of words corresponding to an input sequence of phonemes (basic speech sounds) that come from a low-level p...
Ian E. Thomas, Ingrid Zukerman, Jonathan J. Oliver...
We describe the ICSI-SRI-UW team’s entry in the Spring 2004 NIST Meeting Recognition Evaluation. The system was derived from SRI’s 5xRT Conversational Telephone Speech (CTS) r...
Chuck Wooters, Nikki Mirghafori, Andreas Stolcke, ...
Long-span language models that capture syntax and semantics are seldom used in the first pass of large vocabulary continuous speech recognition systems due to the prohibitive sea...
Anoop Deoras, Tomas Mikolov, Stefan Kombrink, Mart...