We present a framework for speech recognition that accounts for hidden articulatory information. We model the articulatory space using a codebook of articulatory configurations geometrically derived from the EMA measurements available in the MOCHA database. The articulatory parameters we derive take the form of Maeda parameters. These parameters, in turn, drive a physiologically motivated articulatory speech synthesizer based on the model of Sondhi and Schroeter. We use the distortion between the speech synthesized from each articulatory configuration and the original speech as features for recognition. We set up a segmented phoneme recognition task on the MOCHA database using Gaussian mixture models (GMMs). Improvements are achieved when the probability scores generated from the distortion features are combined with the scores obtained from acoustic features.
Ziad Al Bawab, Bhiksha Raj, Richard M. Stern
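The pipeline described in the abstract can be illustrated with a minimal sketch, not the authors' implementation: distortion features are computed for a speech frame against spectra synthesized from each articulatory codebook entry, and per-phoneme GMM log-likelihoods from the distortion stream and the acoustic stream are combined. The Euclidean log-spectral distortion, the linear interpolation weight, and all function and variable names below are illustrative assumptions; the abstract does not specify them.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def distortion_features(frame_spectrum, synth_spectra):
    """Distortion of one observed log-spectrum against each codebook entry.

    frame_spectrum : (D,) log-spectrum of an original speech frame
    synth_spectra  : (K, D) log-spectra synthesized from the K articulatory
                     codebook configurations (assumed precomputed)
    Returns a K-dimensional distortion feature vector.
    """
    return np.linalg.norm(synth_spectra - frame_spectrum, axis=1)

def combined_score(gmm_dist, gmm_acou, dist_feats, acou_feats, w=0.5):
    """Weighted combination of segment log-likelihoods from the two streams.

    w is an interpolation weight; the combination rule is an assumption,
    not one stated in the abstract.
    """
    ll_dist = gmm_dist.score_samples(dist_feats).sum()
    ll_acou = gmm_acou.score_samples(acou_feats).sum()
    return w * ll_dist + (1.0 - w) * ll_acou

def classify_segment(dist_feats, acou_feats, models, w=0.5):
    """models: dict mapping phoneme -> (distortion GMM, acoustic GMM),
    each a fitted sklearn GaussianMixture. Returns the best phoneme."""
    scores = {ph: combined_score(gd, ga, dist_feats, acou_feats, w)
              for ph, (gd, ga) in models.items()}
    return max(scores, key=scores.get)
```

In this sketch one GMM per phoneme would be trained on each feature stream (e.g., `GaussianMixture(n_components=8, covariance_type='diag').fit(X)`), and a segmented phoneme is labeled with the class whose combined score is highest.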