Language identification (LID) is the task of identifying the unknown language of a test utterance. In this paper, a new feature extraction method, viz., Teager Energy Based Mel Frequency Cepstral Coefficients (T-MFCC), is developed for the identification of perceptually similar languages. An LID system for Hindi and Urdu (perceptually similar Indian languages) is presented to demonstrate the effectiveness of the proposed feature set, with a short discussion of the experimental results.
Keywords- Language identification, Teager Energy Operator (TEO), Mel cepstrum, polynomial classifier, discriminative training.
Hemant A. Patil, T. K. Basu
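For readers unfamiliar with the Teager Energy Operator named in the keywords, the following is a minimal sketch of its standard discrete-time form, psi[x(n)] = x(n)^2 - x(n-1) x(n+1). It illustrates the operator only and is not the authors' T-MFCC pipeline, whose details are not given in this abstract. The function name `teager_energy` and the test tone are assumptions made for illustration.

```python
import math


def teager_energy(x):
    """Discrete Teager Energy Operator (hypothetical helper name).

    Returns psi[n] = x[n]^2 - x[n-1] * x[n+1] for the interior samples,
    so the output is two samples shorter than the input.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Illustrative input (an assumption, not data from the paper):
# a pure tone A*cos(omega*n) with A = 2.0 and omega = 0.3 rad/sample.
amplitude, omega = 2.0, 0.3
tone = [amplitude * math.cos(omega * n) for n in range(50)]
psi = teager_energy(tone)

# For a pure tone the operator is constant and equals A^2 * sin^2(omega).
expected = amplitude ** 2 * math.sin(omega) ** 2
```

For a pure tone the output depends on both amplitude and frequency, which is the property that generally motivates using TEO-based energy in place of the conventional squared-magnitude energy in cepstral features.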