Localized spectro-temporal cepstral analysis of speech

Drawing on recent progress in auditory neuroscience, we present a novel speech feature analysis technique based on localized spectro-temporal cepstral analysis of speech. We proceed by extracting localized 2D patches from the spectrogram and projecting them onto a 2D discrete cosine transform (2D-DCT) basis. For each time frame, a speech feature vector is then formed by concatenating low-order 2D-DCT coefficients from the set of corresponding patches. We argue that our framework has significant advantages over standard one-dimensional MFCC features. In particular, we find that our features are more robust to noise and better capture temporal modulations important for recognizing plosive sounds. We evaluate the performance of the proposed features on a TIMIT classification task in clean, pink, and babble noise conditions, and show that our feature analysis outperforms traditional features based on MFCCs.
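The patch-and-project step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the patch size, the frequency step, and the number of retained coefficients are all assumptions chosen for illustration, and the spectrogram is assumed to be already computed and stored as `spec[frequency][time]`.

```python
import math

def dct_basis(n):
    """Orthonormal DCT-II basis: row k holds the k-th cosine basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (i + 0.5) * k / n)
                     for i in range(n)])
    return rows

def dct2d(patch):
    """2D DCT-II of a rectangular patch, applied separably along both axes."""
    n_f, n_t = len(patch), len(patch[0])
    cf, ct = dct_basis(n_f), dct_basis(n_t)
    # Transform along the frequency axis first.
    tmp = [[sum(cf[u][i] * patch[i][j] for i in range(n_f))
            for j in range(n_t)] for u in range(n_f)]
    # Then transform along the time axis.
    return [[sum(tmp[u][j] * ct[v][j] for j in range(n_t))
             for v in range(n_t)] for u in range(n_f)]

def patch_features(spec, frame, patch_f=4, patch_t=4, step_f=2, keep=2):
    """Concatenate low-order 2D-DCT coefficients of the patches around a frame.

    All sizes are hypothetical defaults, not values from the paper.
    """
    n_f, n_t = len(spec), len(spec[0])
    # Center the patch on the frame in time, clamped to the spectrogram edges.
    t0 = min(max(frame - patch_t // 2, 0), n_t - patch_t)
    features = []
    # Slide the patch along the frequency axis with overlap.
    for f0 in range(0, n_f - patch_f + 1, step_f):
        patch = [row[t0:t0 + patch_t] for row in spec[f0:f0 + patch_f]]
        coeffs = dct2d(patch)
        # Keep only the top-left keep x keep block (lowest-order terms).
        for u in range(keep):
            features.extend(coeffs[u][:keep])
    return features
```

Calling `patch_features(spec, frame=t)` for each frame index `t` yields one feature vector per time frame. Keeping only the low-order coefficients discards fine detail within each patch while retaining its smooth spectro-temporal shape, which is the same motivation behind truncating the cepstrum in one-dimensional MFCC analysis.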

Jake V. Bouvrie, Tony Ezzat, Tomaso Poggio

Feature Analysis | ICASSP 2008 | Signal Processing | Speech Feature | Speech Feature Analysis

Post Info

Added	30 May 2010
Updated	30 May 2010
Type	Conference
Year	2008
Where	ICASSP
Authors	Jake V. Bouvrie, Tony Ezzat, Tomaso Poggio
