Lattice-Based Search for Spoken Utterance Retrieval



Recent work on spoken document retrieval has suggested that it is adequate to take the single-best output of ASR and perform text retrieval on this output. This is reasonable enough for the task of retrieving broadcast news stories, where word error rates (WERs) are relatively low and the stories are long enough to contain much redundancy. But it is patently not reasonable if one's task is to retrieve a short snippet of speech in a domain where WERs can be as high as 50%; such would be the situation with teleconference speech, where one's task is to find if and when a participant uttered a certain phrase. In this paper we propose an indexing procedure for spoken utterance retrieval that works on lattices rather than just single-best text. We demonstrate that this procedure can improve F scores by over five points compared to single-best retrieval on tasks with poor WER and low redundancy. The representation is flexible so that we can represent both word lattices, as well as p...
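
To make the contrast between single-best and lattice indexing concrete, here is a minimal sketch. It is not the authors' implementation: the arc format, the function names `build_index` and `search`, the threshold parameter, and all posterior values are assumptions made up for illustration. It handles single-word queries only, whereas the task described above is phrase retrieval. Each lattice is assumed to arrive as a list of arcs with precomputed posterior probabilities; the index stores, for every word, its summed posterior (an expected count) per utterance.

```python
from collections import defaultdict

# Toy lattices: utterance id -> list of arcs (start_node, end_node, word, posterior).
# The posteriors are invented; a real system would derive them from the ASR lattice.
lattices = {
    "utt1": [(0, 1, "find", 0.6), (0, 1, "fine", 0.4),
             (1, 2, "phrase", 0.7), (1, 2, "phase", 0.3)],
    "utt2": [(0, 1, "phase", 0.9), (0, 1, "face", 0.1)],
}


def build_index(lattices):
    """Build an inverted index: word -> {utterance id: expected count}."""
    index = defaultdict(lambda: defaultdict(float))
    for utt_id, arcs in lattices.items():
        for _start, _end, word, posterior in arcs:
            # Summing arc posteriors gives the expected count of the word.
            index[word][utt_id] += posterior
    return index


def search(index, word, threshold=0.0):
    """Return (utterance id, expected count) pairs at or above the threshold,
    highest expected count first."""
    hits = [(utt, count) for utt, count in index.get(word, {}).items()
            if count >= threshold]
    return sorted(hits, key=lambda hit: -hit[1])


index = build_index(lattices)
print(search(index, "phase", threshold=0.2))
```

In this toy data the single-best path for utt1 would be "find phrase", so a single-best index would miss "phase" in utt1. The lattice index returns it with a lower expected count, and raising the threshold trades recall for precision.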

Murat Saraclar, Richard Sproat


NAACL 2004 | NAACL 2007 | One's Task | Spoken Document Retrieval | Spoken Utterance Retrieval


Post Info

Added	31 Oct 2010
Updated	31 Oct 2010
Type	Conference
Year	2004
Where	NAACL
Authors	Murat Saraclar, Richard Sproat

