Language Modeling Approach for Retrieving Passages in Lecture Audio Data



Spoken Document Retrieval (SDR) is a promising technology for enhancing the utility of spoken materials. After the spoken documents have been transcribed by using a Large Vocabulary Continuous Speech Recognition (LVCSR) decoder, a text-based ad hoc retrieval method can be applied directly to the transcribed documents. However, recognition errors will significantly degrade the retrieval performance. To address this problem, we have previously proposed a method that aimed to fill the gap between automatically transcribed text and correctly transcribed text by using a statistical translation technique. In this paper, we extend the method by (1) using neighboring context to index the target passage, and (2) applying a language modeling approach for document retrieval. Our experimental evaluation shows that context information can improve retrieval performance, and that the language modeling approach is effective in incorporating context information into the proposed SDR method, which uses...
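The abstract does not spell out how context enters the retrieval model, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes a unigram query-likelihood model in which each passage is scored by a linear interpolation of three distributions: the passage itself, its neighboring passages, and the whole collection. The function names, the `window` parameter, and the interpolation weights are all hypothetical, and the toy passages stand in for LVCSR transcripts.

```python
import math
from collections import Counter


def ml_model(tokens):
    """Maximum-likelihood unigram distribution over a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def score_passages(query, passages, window=1, w_passage=0.6, w_context=0.3):
    """Return the log query likelihood of every passage.

    Assumption: each passage model interpolates the passage, its
    neighbours within `window`, and the collection. The weights are
    illustrative values, not taken from the paper.
    """
    w_collection = 1.0 - w_passage - w_context
    collection = ml_model([t for p in passages for t in p])
    scores = []
    for i, passage in enumerate(passages):
        lo, hi = max(0, i - window), min(len(passages), i + window + 1)
        # Neighbouring context: surrounding passages, excluding the target.
        context = [t for j in range(lo, hi) if j != i for t in passages[j]]
        p_model, c_model = ml_model(passage), ml_model(context)
        log_likelihood = 0.0
        for term in query:
            if term not in collection:
                continue  # query terms unseen in the collection are skipped
            prob = (w_passage * p_model.get(term, 0.0)
                    + w_context * c_model.get(term, 0.0)
                    + w_collection * collection[term])
            log_likelihood += math.log(prob)
        scores.append(log_likelihood)
    return scores


# Toy stand-ins for transcribed lecture passages.
passages = [
    "today we introduce hidden markov models".split(),
    "the forward algorithm computes the likelihood".split(),
    "next week covers neural networks".split(),
]
scores = score_passages("markov likelihood".split(), passages)
best = max(range(len(passages)), key=scores.__getitem__)
```

In this toy run the first passage contains only one of the two query terms, but its neighbor supplies the other, so the context component lifts it above the remaining passages. Setting `w_context=0.0` removes that effect and reduces the sketch to ordinary collection-smoothed query likelihood.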

Koichiro Honda, Tomoyosi Akiba


Document Retrieval | Education | LREC 2010 | Spoken Document Retrieval | Spoken Documents |


Post Info

Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2010
Where	LREC
Authors	Koichiro Honda, Tomoyosi Akiba


Sciweavers
