Deep Belief Networks using discriminative features for phone recognition

Deep Belief Networks (DBNs) are multi-layer generative models. They can be trained to model windows of coefficients extracted from speech and they discover multiple layers of features that capture the higher-order statistical structure of the data. These features can be used to initialize the hidden units of a feed-forward neural network that is then trained to predict the HMM state for the central frame of the window. Initializing with features that are good at generating speech makes the neural network perform much better than initializing with random weights. DBNs have already been used successfully for phone recognition with input coefficients that are MFCCs or filterbank outputs [1, 2]. In this paper, we demonstrate that they work even better when their inputs are speaker adaptive, discriminative features. On the standard TIMIT corpus, they give phone error rates of 19.6% using monophone HMMs and a bigram language model and 19.4% using monophone HMMs and a trigram language mod...
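
The abstract describes networks that take a window of coefficient frames as input and predict the HMM state of the window's central frame. As a rough illustration of that input/target pairing only, the sketch below builds such windows from a toy utterance. The function name `make_windows`, the `context` parameter, the toy data, and the edge-padding rule (repeating the first or last frame) are all assumptions made for this example; none of them come from the paper.

```python
from typing import List, Sequence, Tuple


def make_windows(frames: Sequence[Sequence[float]],
                 states: Sequence[int],
                 context: int) -> List[Tuple[List[float], int]]:
    """Pair each frame's context window with the HMM state of its central frame.

    Edge frames are padded by repeating the first/last frame. This padding
    rule is an assumption; the abstract does not say how utterance
    boundaries are handled.
    """
    if len(frames) != len(states):
        raise ValueError("frames and states must have the same length")
    n = len(frames)
    examples: List[Tuple[List[float], int]] = []
    for t in range(n):
        window: List[float] = []
        for offset in range(-context, context + 1):
            idx = min(max(t + offset, 0), n - 1)  # clamp to utterance bounds
            window.extend(frames[idx])            # concatenate coefficients
        examples.append((window, states[t]))      # target: central frame's state
    return examples


# Toy utterance: 4 frames of 2 coefficients each, with made-up state labels.
frames = [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1], [3.0, 3.1]]
states = [5, 5, 7, 7]
examples = make_windows(frames, states, context=1)
```

With `context=1`, each input vector concatenates three frames (six coefficients here), and each target is the state label of the middle frame. Generative pre-training and the subsequent discriminative training would operate on pairs like these, but are not shown.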

Abdel-rahman Mohamed, Tara N. Sainath, George Dahl

ICASSP 2011 | Language Model | Monophone HMMs | Neural Network | Signal Processing

Post Info

Added	21 Aug 2011
Updated	21 Aug 2011
Type	Journal
Year	2011
Where	ICASSP
Authors	Abdel-rahman Mohamed, Tara N. Sainath, George Dahl, Bhuvana Ramabhadran, Geoffrey E. Hinton, Michael A. Picheny
