Sciweavers

SIGIR
2006
ACM

Spoken document retrieval from call-center conversations

We are interested in retrieving information from conversational speech corpora, such as call-center data. This data consists of spontaneous conversations recorded at low quality, which makes automatic speech recognition (ASR) a highly difficult task. For typical call-center data, even state-of-the-art large vocabulary continuous speech recognition systems produce a transcript with a word error rate of 30% or higher. In addition to the output transcript, advanced systems provide word confusion networks (WCNs), a compact representation of word lattices that associates each word hypothesis with its posterior probability. Our work exploits the information provided by WCNs to improve retrieval performance. In this paper, we show that mean average precision (MAP) improves when WCNs are used instead of the raw word transcripts. Finally, we analyze the effect of increasing ASR word error rate on search effectiveness. We show that MAP is still reasonable even under extremely high ...
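To make the two central notions in the abstract concrete, here is a minimal Python sketch of a word confusion network and of mean average precision. It is an illustration only and not the authors' indexing or scoring scheme: the toy network `wcn_doc`, the helper names, and the use of summed posteriors as expected term counts are all assumptions introduced here.

```python
from collections import defaultdict

# Hypothetical toy WCN: one dict per time slot, mapping each competing
# word hypothesis to its posterior probability (values are made up).
wcn_doc = [
    {"please": 0.9, "police": 0.1},
    {"reset": 0.6, "recent": 0.4},
    {"password": 0.7, "passport": 0.3},
]

def one_best(wcn):
    """Return the 1-best transcript: the top hypothesis of every slot."""
    return [max(slot, key=slot.get) for slot in wcn]

def expected_term_counts(wcn):
    """Sum posteriors per word, giving a soft term count for indexing.

    This is one simple way to use WCN posteriors (an assumption of this
    sketch), so that words missing from the 1-best transcript remain
    searchable with a lower weight.
    """
    counts = defaultdict(float)
    for slot in wcn:
        for word, posterior in slot.items():
            counts[word] += posterior
    return dict(counts)

def average_precision(ranked_ids, relevant_ids):
    """Average precision of one ranked result list."""
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            total += hits / rank  # precision at this relevant hit
    return total / len(relevant_ids) if relevant_ids else 0.0

def mean_average_precision(runs):
    """MAP over (ranked_ids, relevant_ids) pairs, one pair per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)
```

In this toy network the word "recent" never appears in the 1-best transcript, yet it keeps a soft count of 0.4 in the WCN-based representation, which is the kind of extra evidence a WCN-aware index can draw on when the recognizer's top hypothesis is wrong.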
Jonathan Mamou, David Carmel, Ron Hoory
Added 14 Jun 2010
Updated 14 Jun 2010
Type Conference
Year 2006
Where SIGIR
Authors Jonathan Mamou, David Carmel, Ron Hoory