Pronunciation modeling by sharing Gaussian densities across phonetic models

Conversational speech exhibits considerable pronunciation variability, which has been shown to have a detrimental effect on the accuracy of automatic speech recognition. There have been many attempts to model pronunciation variation, including the use of decision-trees to generate alternate word pronunciations from phonemic baseforms. Use of such pronunciation models during recognition is known to improve accuracy. This paper describes the use of such pronunciation models during acoustic model training. Subtle difficulties in the straightforward use of alternatives to canonical pronunciations are first illustrated: it is shown that simply improving the accuracy of the phonetic transcription used for acoustic model training is of little benefit. Analysis of this paradox leads to a new method of accommodating nonstandard pronunciations: rather than allowing a phoneme in the canonical pronunciation to be realized as one of a few distinct alternate phones predicted by the pronunciation mo...

Murat Saraclar, Harriet J. Nock, Sanjeev Khudanpur

Acoustic Model Training | Automated Reasoning | CSL 2000 | Model | Pronunciation Model

Post Info

Added	17 Dec 2010
Updated	17 Dec 2010
Type	Journal
Year	2000
Where	CSL
Authors	Murat Saraclar, Harriet J. Nock, Sanjeev Khudanpur
