Multi-view and multi-objective semi-supervised learning for large vocabulary continuous speech recognition


Current hidden Markov model (HMM) acoustic modeling for large vocabulary continuous speech recognition (LVCSR) relies on abundant labeled transcriptions. Because labeling speech is expensive and time-consuming while large amounts of unlabeled data are readily available, semi-supervised learning (SSL) from both labeled and unlabeled data, which aims to reduce the development cost of LVCSR, is more important than ever. In this paper, we propose SSL for LVCSR using multiple views learned from different acoustic features and randomized decision trees. In addition, we develop multi-objective learning of HMM-based acoustic models by optimizing a hybrid criterion that combines the discriminative mutual information from labeled data with the entropy from unlabeled data. Experiments on Broadcast News show the benefits of the proposed methods.
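The hybrid criterion can be illustrated with a toy sketch. The abstract does not give the exact formulation, so everything below is an assumption: the function names `log_softmax` and `hybrid_objective`, the weight `alpha`, the sign convention (reward the labeled term, penalize entropy on unlabeled data), and the use of frame-level class scores in place of HMM lattices. It is meant only to show how a discriminative term and an entropy term might be combined into one objective, not to reproduce the authors' method.

```python
import math

def log_softmax(scores):
    """Convert one frame's raw class scores into log posteriors."""
    peak = max(scores)  # subtract the max for numerical stability
    log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return [s - log_norm for s in scores]

def hybrid_objective(labeled_scores, labels, unlabeled_scores, alpha=0.5):
    """Toy hybrid criterion (assumed form): labeled term minus alpha * entropy.

    labeled_scores:   list of per-frame score lists with known labels
    labels:           reference class index for each labeled frame
    unlabeled_scores: list of per-frame score lists without labels
    alpha:            hypothetical trade-off weight between the two terms
    """
    # Discriminative term: average log posterior of the reference class,
    # a frame-level stand-in for a mutual-information criterion.
    labeled_term = sum(
        log_softmax(frame)[y] for frame, y in zip(labeled_scores, labels)
    ) / len(labels)

    # Entropy term: average posterior entropy over the unlabeled frames.
    entropy = 0.0
    for frame in unlabeled_scores:
        log_post = log_softmax(frame)
        entropy -= sum(math.exp(lp) * lp for lp in log_post)
    entropy /= len(unlabeled_scores)

    # Higher is better: confident, correct labeled predictions and
    # low-entropy (confident) unlabeled predictions both raise the value.
    return labeled_term - alpha * entropy

# Usage: the same labeled frame, with an uncertain vs. a confident unlabeled frame.
uncertain = hybrid_objective([[0.0, 0.0]], [0], [[0.0, 0.0]])
confident = hybrid_objective([[0.0, 0.0]], [0], [[10.0, 0.0]])
```

Under these assumptions, `confident` is larger than `uncertain`, because the entropy penalty shrinks when the model is sure about the unlabeled frame. In an actual system the scores would come from the HMM acoustic model and the objective would be optimized over its parameters.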

Xiaodong Cui, Jing Huang, Jen-Tzung Chien

Abundant Labeled Transcriptions | ICASSP 2011 | Markov Acoustic Modeling | Signal Processing | Unlabeled Data |

Post Info

Added	21 Aug 2011
Updated	21 Aug 2011
Type	Journal
Year	2011
Where	ICASSP
Authors	Xiaodong Cui, Jing Huang, Jen-Tzung Chien

Sciweavers
