Varying channel conditions present a difficult problem for many speech technologies, such as language identification (LID). Channel compensation techniques have been shown to significantly improve LID performance for acoustic systems [1]. For high-level token systems, nuisance attribute projection (NAP) has been shown to perform well in the context of speaker identification [2]. In this work, we describe a novel approach to the high-dimensional, sparse NAP training problem as applied to a 4-gram phonotactic LID system [3] run on the NIST 2009 Language Recognition Evaluation (LRE) [4] task. We demonstrate performance gains on the Voice of America (VOA) portion of the 2009 LRE data.
Fred S. Richardson, William M. Campbell