TASLP
2002

Automatic generation of subword units for speech recognition systems

Large vocabulary continuous speech recognition (LVCSR) systems traditionally represent words in terms of smaller subword units. Both during training and during recognition, they require a mapping table, called the dictionary, which maps words into sequences of these subword units. The performance of the LVCSR system depends critically on the definition of the subword units and the accuracy of the dictionary. In current LVCSR systems, both these components are manually designed. While manually designed subword units generalize well, they may not be the optimal units of classification for the specific task or environment for which an LVCSR system is trained. Moreover, when human expertise is not available, it may not be possible to design good subword units manually. There is clearly a need for data-driven design of these LVCSR components. In this paper, we present a complete probabilistic formulation for the automatic design of subword units and dictionary, given only the acoustic data ...
Rita Singh, Bhiksha Raj, Richard M. Stern
Added 23 Dec 2010
Updated 23 Dec 2010
Type Journal
Year 2002
Where TASLP
Authors Rita Singh, Bhiksha Raj, Richard M. Stern