ICASSP
2011
IEEE

Speech synthesis using HMM based diphone inventory encoding for low-resource devices

In this paper we describe the compression of diphone inventories used in the acoustic synthesis stage of a concatenative speech synthesis system. The inventory compression is based on a codebook drawn from the Gaussian mean vectors of phoneme HMMs. We present two encoding/synthesis schemes, one speaker-dependent and one speaker-independent. The advantage of the latter is that a recognizer and a synthesizer can potentially share the same HMMs. We describe the steps to encode the inventories as well as the acoustic synthesis that uses them. With the proposed method, a diphone inventory of 1175 units can be compressed to 19 kB. We show that the synthesis quality with HMM-encoded inventories matches that of synthesis with AMR- or Speex-encoded inventories at noticeably smaller inventory sizes.
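The codebook idea in the abstract can be illustrated with a minimal sketch. It assumes that each diphone unit is a sequence of feature frames and that each frame is replaced by the index of the nearest Gaussian mean vector. The abstract does not specify the feature type, the distance measure, or the storage format, so the squared Euclidean distance, the toy two-dimensional codebook, and the names `nearest_index`, `encode_unit`, and `decode_unit` are illustrative assumptions, not the authors' method. The last line only works out the average size per unit implied by the stated figures, assuming 1 kB = 1000 bytes.

```python
import math


def nearest_index(frame, codebook):
    """Return the index of the codebook vector closest to `frame`.

    Assumption: squared Euclidean distance; the paper may use another measure.
    """
    best_index, best_distance = 0, math.inf
    for index, mean in enumerate(codebook):
        distance = sum((x - m) ** 2 for x, m in zip(frame, mean))
        if distance < best_distance:
            best_index, best_distance = index, distance
    return best_index


def encode_unit(frames, codebook):
    """Encode one unit as a list of codebook indices, one per frame."""
    return [nearest_index(frame, codebook) for frame in frames]


def decode_unit(indices, codebook):
    """Recover approximate frames by looking the indices up in the codebook."""
    return [codebook[index] for index in indices]


# Hypothetical codebook: stands in for Gaussian mean vectors of phoneme HMMs.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]

# Hypothetical feature frames of a single diphone unit.
unit_frames = [[0.1, -0.2], [0.9, 1.2], [3.5, 0.3]]

indices = encode_unit(unit_frames, codebook)
decoded = decode_unit(indices, codebook)

# Average bytes per unit implied by the abstract (19 kB for 1175 units),
# assuming 1 kB = 1000 bytes.
bytes_per_unit = 19 * 1000 / 1175
```

Under these assumptions only the index sequences are stored per unit, while the codebook itself is stored once. In the speaker-independent scheme described in the abstract, that codebook could be the one a recognizer already holds.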
Guntram Strecha, Matthias Wolff
Added 21 Aug 2011
Updated 21 Aug 2011
Type Journal
Year 2011
Where ICASSP