Speaker identification with distant microphone speech

The field of speaker identification has recently seen significant advancement, but improvements have tended to be benchmarked on near-field speech, ignoring the more realistic setting of far-field-instrumented speakers. In this work we present several findings on far-field speech from the MIXER5 Corpus, in the areas of feature extraction, speaker modeling, and multichannel score combination. First, we observe that minimum-variance distortionless response (MVDR) features outperform Mel-frequency cepstral coefficient (MFCC) features, and that fundamental frequency variation (FFV) features offer complementary information to both MFCC and MVDR features. Second, we present evidence that factor analysis significantly improves system performance compared to the more traditional GMM/UBM strategy. Third, we find that frame-based score competition significantly improves performance under mismatched conditions when multiple channels are available.
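The abstract does not spell out how frame-based score competition works. As a rough illustration only, one plausible reading is that each speaker has one model per microphone channel, each test frame is scored against all of that speaker's channel models, the best per-frame score is kept, and the winning scores are summed over frames. The sketch below follows that assumed reading; the function name, the nested-dictionary layout, and the toy log-likelihood values are all hypothetical and are not taken from the paper.

```python
# Hypothetical sketch of frame-based score competition (assumed reading, not the paper's code).
# scores[speaker][channel] is a list of per-frame log-likelihoods (invented layout and values).

def frame_based_score_competition(scores):
    """Return (best_speaker, totals).

    totals[speaker] is the sum over frames of the highest per-frame
    log-likelihood among that speaker's channel-specific models.
    """
    totals = {}
    for speaker, channel_scores in scores.items():
        per_channel = list(channel_scores.values())
        num_frames = len(per_channel[0])
        total = 0.0
        for t in range(num_frames):
            # Competition step: keep only the best-matching channel model for this frame.
            total += max(frames[t] for frames in per_channel)
        totals[speaker] = total
    # Identification decision: the speaker with the highest accumulated score.
    best_speaker = max(totals, key=totals.get)
    return best_speaker, totals


# Toy example: two speakers, two channel models each, three frames.
toy_scores = {
    "spk_a": {"ch1": [-1.0, -4.0, -2.0], "ch2": [-3.0, -1.5, -2.5]},
    "spk_b": {"ch1": [-2.0, -2.0, -3.0], "ch2": [-2.5, -3.0, -2.0]},
}
best, totals = frame_based_score_competition(toy_scores)
print(best, totals)
```

Taking the maximum per frame, rather than per utterance, lets different frames be explained by different channel models, which is one way such a scheme could help when the test channel does not match any single training channel.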

Fundamental Frequency Variation | ICASSP 2010 | Mel-frequency Cepstral Coefficient | Multichannel Score Combination | Signal Processing

Post Info

Added	06 Dec 2010
Updated	06 Dec 2010
Type	Conference
Year	2010
Where	ICASSP
Authors	Qin Jin, Runxin Li, Qian Yang, Kornel Laskowski, Tanja Schultz
