Abstract. Identifying musical instruments in polyphonic sounds is difficult, especially where heterogeneous harmonic partials overlap. This difficulty has stimulated research on sound separation for content-based automatic music information retrieval. Numerous successful approaches to musical feature extraction and selection have been proposed for instrument recognition in monophonic sounds. Unfortunately, none of those algorithms can be successfully applied to polyphonic sounds. Building on recent successful research in the classification of monophonic sounds and on studies in speech recognition, the Moving Picture Experts Group (MPEG) standardized a set of features of digital audio content for the purpose of interpreting the meaning of the information. Most of these features take the form of a large matrix or vector, which is not suitable for traditional data mining algorithms, while other, smaller features are not sufficient f...
Xin Zhang, Zbigniew W. Ras