Speaker recognition using i-vectors was recently introduced and offers state-of-the-art performance. An i-vector is a compact representation of a speaker’s utterance obtained by projecting it into a low-dimensional total variability subspace trained using factor analysis. Linear discriminant analysis (LDA) is then applied to improve the discrimination between i-vectors from different speakers. Because the technology is new, it remains an open question how best to train the total variability subspace and LDA matrix when using speech collected from distinctly different sources. This paper presents a comparative study of several subspace training techniques and a novel source-normalised-and-weighted LDA algorithm for improving i-vector-based speaker recognition under mismatched evaluation conditions. Results from the NIST 2010 speaker recognition evaluation (SRE) suggest that accounting for source conditions in the LDA matrix as opposed t...
Mitchell McLaren, David A. van Leeuwen