Voice conversion can be reduced to a problem to find a transformation function between the corresponding speech sequences of two speakers. Perhaps the most voice conversions meth...
Speech can be represented as a time/frequency distribution of energy using a multi-band filter bank. A Markov random field model, which takes into account the possible time asynch...
This paper describes the collect and transcription of a large set of Arabic broadcast news speech data. A total of more than 2000 hours of data was transcribed. The transcription ...
To understand a speaker’s turn of a conversation, one needs to segment it into intonational phrases, clean up any speech repairs that might have occurred, and identify discourse...
In this paper, we explore the use of a Gaussian posteriorgram based representation for unsupervised discovery of speech patterns. Compared with our previous work, the new approach...