ICASSP
2010
IEEE

Towards multi-speaker unsupervised speech pattern discovery

In this paper, we explore the use of a Gaussian posteriorgram-based representation for unsupervised discovery of speech patterns. Compared with our previous work, the new approach makes significant progress toward speaker independence. The framework consists of three main procedures: a Gaussian posteriorgram generation procedure, which trains a Gaussian mixture model without supervision and labels each speech frame with a Gaussian posteriorgram; a segmental dynamic time warping procedure, which locates pairs of similar sequences of Gaussian posteriorgram vectors; and a graph clustering procedure, which groups similar sequences into clusters. We demonstrate that the posteriorgram approach can handle many talkers by finding clusters of words in the TIMIT corpus.
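The first two procedures described above can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the Gaussian mixture parameters are fixed by hand instead of being learned from unlabeled speech, plain dynamic time warping stands in for the segmental variant, the frame distance is taken to be the negative log of the inner product of two posterior vectors, and the graph clustering step is omitted. All function names (`posteriorgram`, `frame_distance`, `dtw_cost`) are hypothetical.

```python
import numpy as np


def posteriorgram(frames, means, variances, weights):
    """Map T feature frames to a (T, K) matrix of component posteriors.

    Assumes K diagonal-covariance Gaussians with given (not learned) parameters.
    """
    diff = frames[:, None, :] - means[None, :, :]            # (T, K, D)
    log_lik = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)                # Mahalanobis term
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)    # normalizer, (K,)
    )
    log_post = np.log(weights) + log_lik                     # unnormalized, (T, K)
    log_post -= log_post.max(axis=1, keepdims=True)          # numerical stability
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)            # rows sum to 1


def frame_distance(p, q, eps=1e-10):
    """Assumed distance: -log of the inner product, floored at eps."""
    return -np.log(max(float(np.dot(p, q)), eps))


def dtw_cost(a, b):
    """Accumulated cost of the best plain DTW alignment of two posteriorgrams."""
    n, m = len(a), len(b)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(a[i - 1], b[j - 1])
            acc[i, j] = d + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return acc[n, m]


# Toy two-component model in two dimensions (illustrative values only).
means = np.array([[0.0, 0.0], [5.0, 5.0]])
variances = np.ones((2, 2))
weights = np.array([0.5, 0.5])

# Two utterances with the same pattern at different speeds, and one reversed.
seq_a = posteriorgram(np.array([[0.0, 0.0], [0.0, 0.0], [5.0, 5.0]]), means, variances, weights)
seq_b = posteriorgram(np.array([[0.0, 0.0], [5.0, 5.0], [5.0, 5.0]]), means, variances, weights)
seq_c = posteriorgram(np.array([[5.0, 5.0], [0.0, 0.0], [0.0, 0.0]]), means, variances, weights)

print("same pattern:", dtw_cost(seq_a, seq_b))
print("different pattern:", dtw_cost(seq_a, seq_c))
```

In this toy setup the two sequences that share a pattern align at a much lower cost than the reversed one, which is the kind of similarity signal a later clustering step could group on.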
Yaodong Zhang, James R. Glass
Added: 06 Dec 2010
Updated: 06 Dec 2010
Type: Conference
Year: 2010
Where: ICASSP
Authors: Yaodong Zhang, James R. Glass