
EMNLP
2007

A Statistical Language Modeling Approach to Lattice-Based Spoken Document Retrieval

Speech recognition transcripts are far from perfect; on their own, they are not of sufficient quality to be useful for spoken document retrieval. This is especially the case for conversational speech. Recent efforts have tried to overcome this issue by using statistics from speech lattices instead of only the 1-best transcripts; however, these efforts have invariably used the classical vector space retrieval model. This paper presents a novel approach to lattice-based spoken document retrieval using statistical language models: a statistical model is estimated for each document, and probabilities derived from the document models are directly used to measure relevance. Experimental results show that the lattice-based language modeling method outperforms both the language modeling retrieval method using only the 1-best transcripts and a recently proposed lattice-based vector space retrieval method.
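The idea of estimating a model per document and scoring relevance by probability can be illustrated with a minimal query-likelihood sketch. It is not the paper's exact estimator: the abstract does not specify how lattice statistics are turned into document models, so the sketch assumes that each document is already summarized as expected word counts (posterior-weighted counts from a lattice), and it assumes Jelinek-Mercer smoothing against a collection model. The names `collection_model`, `query_log_likelihood`, `rank`, and the parameters `lam` and `floor` are all hypothetical.

```python
import math
from collections import Counter


def collection_model(docs):
    """Pool expected counts over all documents into a background unigram model."""
    totals = Counter()
    for counts in docs.values():
        totals.update(counts)  # adds (possibly fractional) counts per word
    total_mass = sum(totals.values())
    return {word: count / total_mass for word, count in totals.items()}


def query_log_likelihood(query, counts, background, lam=0.5, floor=1e-9):
    """Log P(query | document) under a smoothed unigram document model.

    counts: expected word counts for one document (assumed lattice statistics).
    lam: interpolation weight on the document model (assumed value).
    floor: background probability for words unseen in the whole collection.
    """
    doc_mass = sum(counts.values())
    score = 0.0
    for word in query:
        p_doc = counts.get(word, 0.0) / doc_mass if doc_mass > 0 else 0.0
        p_bg = background.get(word, floor)
        score += math.log(lam * p_doc + (1.0 - lam) * p_bg)
    return score


def rank(query, docs, lam=0.5):
    """Return document ids sorted by decreasing query log-likelihood."""
    background = collection_model(docs)
    scores = {
        doc_id: query_log_likelihood(query, counts, background, lam)
        for doc_id, counts in docs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (invented for illustration): doc_a keeps "budget" as a lattice
# hypothesis with posterior mass 0.6, while doc_b has no evidence for it.
docs = {
    "doc_a": {"budget": 0.6, "bucket": 0.4, "meeting": 1.0},
    "doc_b": {"bucket": 1.0, "meeting": 1.0},
}
ranking = rank(["budget"], docs)
```

With fractional counts, a word that the recognizer considered but did not place on the 1-best path still contributes probability mass to the document model, which is the intuition behind using lattices rather than 1-best transcripts.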
Tee Kiah Chia, Haizhou Li, Hwee Tou Ng
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2007
Where EMNLP
Authors Tee Kiah Chia, Haizhou Li, Hwee Tou Ng