Jointly recognizing multi-speaker conversations



We suggest an approach to speech recognition where multiple sides of a conversation in a dialog or meeting are processed and decoded jointly rather than independently. We moreover introduce a practical implementation of this approach that demonstrates both language model perplexity and speech recognition word error rate improvements in conversational telephone speech. Specifically, we show that such benefits can be had if an n-gram language model, in addition to conditioning on the immediately preceding words in an utterance, is also allowed to condition on the estimated dialog act of the immediately preceding utterance of an alternate speaker.
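To make the conditioning idea concrete, here is a minimal sketch, not the authors' implementation: a toy bigram model whose context is extended with the dialog-act tag of the other speaker's preceding utterance. The class name `DialogActBigramLM`, the add-alpha smoothing, the tag names ("question", "statement") and the tiny training set are all assumptions made for illustration; the abstract does not specify the model order, the smoothing method, or how dialog acts are estimated.

```python
import math
from collections import defaultdict


class DialogActBigramLM:
    """Toy bigram LM estimating P(word | prev_word, dialog_act).

    Hypothetical illustration only: dialog_act is the (assumed known) tag of
    the other speaker's immediately preceding utterance.
    """

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # add-alpha smoothing constant (assumption)
        self.context_counts = defaultdict(int)  # (act, prev_word) -> count
        self.event_counts = defaultdict(int)    # (act, prev_word, word) -> count
        self.vocab = {"</s>"}                   # predictable tokens, incl. end marker

    def _tokens(self, words):
        # Append the end-of-utterance marker to the word sequence.
        return list(words) + ["</s>"]

    def train(self, examples):
        # examples: iterable of (dialog_act_of_other_speaker, list_of_words)
        for act, words in examples:
            prev = "<s>"
            for word in self._tokens(words):
                self.vocab.add(word)
                self.context_counts[(act, prev)] += 1
                self.event_counts[(act, prev, word)] += 1
                prev = word

    def prob(self, word, prev, act):
        # Smoothed estimate of P(word | prev, act).
        num = self.event_counts.get((act, prev, word), 0) + self.alpha
        den = self.context_counts.get((act, prev), 0) + self.alpha * len(self.vocab)
        return num / den

    def perplexity(self, examples):
        # Per-token perplexity over (act, words) pairs, counting </s>.
        log_sum, n = 0.0, 0
        for act, words in examples:
            prev = "<s>"
            for word in self._tokens(words):
                log_sum += math.log(self.prob(word, prev, act))
                n += 1
                prev = word
        return math.exp(-log_sum / n)


# Made-up training data: replies that follow a question vs. a statement.
lm = DialogActBigramLM()
lm.train([
    ("question", ["yes", "i", "do"]),
    ("question", ["no", "i", "dont"]),
    ("statement", ["uh-huh"]),
])
print(lm.prob("yes", "<s>", "question"))
print(lm.prob("yes", "<s>", "statement"))
```

In this toy setup, "yes" at the start of an utterance is more probable after a question from the other speaker than after a statement, which is the kind of cross-speaker dependency the abstract describes exploiting. A real system would need a dialog-act estimator and a stronger smoothing scheme than the one assumed here.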

Gang Ji, Jeff Bilmes


Conversational Telephone Speech | ICASSP 2010 | Language Model | Signal Processing | Speech Recognition Word |


Post Info

Added	06 Dec 2010
Updated	06 Dec 2010
Type	Conference
Year	2010
Where	ICASSP
Authors	Gang Ji, Jeff Bilmes


Sciweavers
