Sciweavers

MLMI
2007
Springer

Modeling Vocal Interaction for Segmentation in Meeting Recognition

Automatic segmentation is an important technology for both automatic speech recognition and automatic speech understanding. In meetings, participants typically vocalize for only a fraction of the recorded time, but standard vocal activity detection algorithms for close-talk microphones in meetings continue to treat participants independently. In this work we present a multispeaker segmentation system which models a particular aspect of human-human communication, that of vocal interaction or the interdependence between participants’ on-off speech patterns. We describe our vocal interaction model, its training, and its use during vocal activity decoding. Our experiments show that this approach almost completely eliminates the problem of crosstalk, and word error rates on our development set are lower than those obtained with human-generated reference segmentation. We also observe significant performance improvements on unseen data.
Kornel Laskowski, Tanja Schultz
Added 08 Jun 2010
Updated 08 Jun 2010
Type Conference
Year 2007
Where MLMI
Authors Kornel Laskowski, Tanja Schultz