Automatic Language Identification (LID) in music has received significantly less attention than LID in speech. Here, we study the problem of LID in music videos uploaded to YouTube. We use a “bag-of-words” approach based on state-of-the-art content-based audio-visual features and linear SVM classifiers for automatic LID. Our system obtains 48% accuracy on a corpus of 25,000 music videos spanning 25 languages.
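To make the pipeline concrete, the following is a minimal sketch of a bag-of-words representation followed by one-vs-rest linear SVM classification. It is an illustration under stated assumptions, not the system evaluated here: the two-entry codebook, the 2-D descriptors, the language labels "en" and "es", the hyperparameters, and all function names are hypothetical, and the SVM is trained with a simple hinge-loss subgradient method rather than any particular solver.

```python
import random

def nearest(codebook, x):
    """Index of the codeword closest to descriptor x (squared Euclidean)."""
    return min(range(len(codebook)),
               key=lambda k: sum((a - b) ** 2 for a, b in zip(codebook[k], x)))

def bow_histogram(descriptors, codebook):
    """L1-normalized histogram of codeword counts for one video."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        hist[nearest(codebook, d)] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]

def train_linear_svm(X, y, lam=0.001, eta=0.1, epochs=200, seed=0):
    """Binary linear SVM (hinge loss + L2 penalty) via subgradient descent.
    Labels y must be in {-1, +1}. lam, eta, epochs are assumed values."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            w = [(1.0 - eta * lam) * wj for wj in w]  # L2 shrinkage
            if margin < 1.0:                          # hinge loss is active
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b

def train_one_vs_rest(X, labels):
    """One binary SVM per language: that language vs. all others."""
    return {lang: train_linear_svm(X, [1 if l == lang else -1 for l in labels])
            for lang in sorted(set(labels))}

def predict(models, x):
    """Return the language whose classifier gives the highest score."""
    def score(model):
        w, b = model
        return sum(wj * xj for wj, xj in zip(w, x)) + b
    return max(models, key=lambda lang: score(models[lang]))

# Toy data (hypothetical): each video is a list of 2-D local descriptors.
codebook = [[0.0, 0.0], [1.0, 1.0]]  # assumed given, e.g. from clustering
train = [
    ([[0.1, 0.0], [0.0, 0.2], [0.1, 0.1], [-0.1, 0.0]], "en"),
    ([[0.1, 0.0], [0.0, 0.2], [0.1, 0.1], [0.9, 1.0]], "en"),
    ([[1.0, 0.9], [0.9, 1.1], [1.1, 1.0], [0.8, 1.0]], "es"),
    ([[1.0, 0.9], [0.9, 1.1], [1.1, 1.0], [0.1, 0.0]], "es"),
]
X = [bow_histogram(desc, codebook) for desc, _ in train]
labels = [lang for _, lang in train]
models = train_one_vs_rest(X, labels)

query = bow_histogram([[0.0, 0.1], [0.1, 0.0]], codebook)
print(predict(models, query))
```

In this sketch each video is reduced to a fixed-length histogram regardless of how many descriptors it contains, which is what allows a linear classifier to be applied directly. A practical system would learn a much larger codebook from data and would typically rely on an established SVM library.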
Vijay Chandrasekhar, Mehmet Emre Sargin, David A.