Investigation of acoustic units for LVCSR systems

One important issue in designing state-of-the-art LVCSR systems is the choice of acoustic units. Context dependent (CD) phones remain the dominant form of acoustic units. They can capture the co-articulatory effect in speech via explicit modelling. However, for other more complicated phonological processes, they rely on the implicit modelling ability of the underlying statistical models. Alternatively, it is possible to construct acoustic models based on higher level linguistic units, for example, syllables, to explicitly capture these complex patterns. When sufficient training data is available, this approach may show an advantage over implicit acoustic modelling. In this paper a wide range of acoustic units are investigated to improve LVCSR system performance. Significant error rate gains up to 7.1% relative (0.8% abs.) were obtained on a state-of-the-art Mandarin Chinese broadcast audio recognition task using word and syllable position dependent triphone and quinphone models.

Xunying Liu, Mark John Francis Gales, Jim L. Hieronymus, Philip C. Woodland

Acoustic Units | ICASSP 2011 | Implicit Acoustic Modelling | Implicit Modelling Ability | Signal Processing |

Post Info

Added	21 Aug 2011
Updated	21 Aug 2011
Type	Journal
Year	2011
Where	ICASSP
Authors	Xunying Liu, Mark John Francis Gales, Jim L. Hieronymus, Philip C. Woodland
