Training of error-corrective model for ASR without using audio data

This paper introduces a method for training an error-corrective model for Automatic Speech Recognition (ASR) without using audio data. Existing techniques assume that sufficient audio data from the target application is available, so that negative samples can be prepared by running the ASR on that audio. This assumption does not always hold. We propose generating probable N-best lists, of the kind the ASR might produce, directly from the text data of the target application by taking phoneme similarity into account. We call this process “Pseudo-ASR”. We then perform discriminative reranking with the error-corrective model, treating the text data as positive samples and the N-best lists from the Pseudo-ASR as negative samples. Experiments with Japanese call center data showed that discriminative reranking based on the Pseudo-ASR improved ASR accuracy.
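
The abstract gives no implementation details, so the following is only a toy sketch of the general idea, not the authors' method. It assumes a hand-made phoneme confusion table (`CONFUSIONS`, with invented probabilities), works on phoneme sequences rather than words, uses phoneme bigram counts as features, and swaps in a plain structured perceptron as the discriminative learner. All function names here are hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical confusion table: phoneme -> [(similar phoneme, substitution prob)].
# The values are invented for illustration only.
CONFUSIONS = {
    "b": [("p", 0.3)],
    "d": [("t", 0.3)],
    "m": [("n", 0.3)],
    "s": [("z", 0.2)],
}

def pseudo_asr(phonemes, n_best=5, seed=0, max_tries=1000):
    """Sample up to n_best distinct corrupted variants of a reference sequence."""
    rng = random.Random(seed)
    reference = tuple(phonemes)
    hypotheses = set()
    for _ in range(max_tries):
        if len(hypotheses) >= n_best:
            break
        hyp = []
        for ph in reference:
            out = ph
            for alt, prob in CONFUSIONS.get(ph, []):
                if rng.random() < prob:
                    out = alt  # substitute a similar-sounding phoneme
                    break
            hyp.append(out)
        hyp = tuple(hyp)
        if hyp != reference:  # keep only erroneous hypotheses (negative samples)
            hypotheses.add(hyp)
    return sorted(hypotheses)

def features(seq):
    """Bigram counts over the sequence, with start/end padding."""
    feats = defaultdict(int)
    padded = ("<s>",) + tuple(seq) + ("</s>",)
    for a, b in zip(padded, padded[1:]):
        feats[(a, b)] += 1
    return feats

def score(weights, seq):
    """Linear score of a sequence under the current feature weights."""
    return sum(weights.get(f, 0.0) * v for f, v in features(seq).items())

def train_perceptron(references, epochs=5):
    """Reference text is the positive sample; pseudo N-best entries are negatives."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for ref in references:
            ref = tuple(ref)
            for neg in pseudo_asr(ref):
                if score(weights, neg) >= score(weights, ref):
                    # Move weights toward the reference and away from the error.
                    for f, v in features(ref).items():
                        weights[f] += v
                    for f, v in features(neg).items():
                        weights[f] -= v
    return weights

def rerank(weights, n_best):
    """Return the highest-scoring hypothesis from an N-best list."""
    return max(n_best, key=lambda seq: score(weights, seq))
```

In this sketch, `train_perceptron` would be called on phoneme sequences derived from the target-application text, and `rerank` would then be applied to the N-best list of a real recognizer. A realistic system would need a confusion model estimated from data and word-level features, neither of which is specified in the abstract.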

Gakuto Kurata, Nobuyasu Itoh, Masafumi Nishimura

Audio Data | Error-corrective Model | ICASSP 2011 | Signal Processing | Target Application

» The EPAC Corpus Manual and Automatic Annotations of Conversational Speech in French Broadc...

» A framework for classification and segmentation of massive audio data streams

» Improving acoustic event detection using generalizable visual features and multimodality m...

» A variational Bayesian methodology for hidden Markov models utilizing Student's-t mixtures

Post Info

Added	21 Aug 2011
Updated	21 Aug 2011
Type	Journal
Year	2011
Where	ICASSP
Authors	Gakuto Kurata, Nobuyasu Itoh, Masafumi Nishimura
