Source separation techniques like independent component analysis and the more recent non-negative matrix factorization are gaining widespread use for the monaural separation of individual tracks present in a music sample. The underlying principle behind these approaches characterises only stationary signals and fails to separate nonstationary sources like speech or vocals. In this paper, we make an attempt to solve this problem and propose solutions to the extraction of vocal tracks from polyphonic audio recordings. We also present techniques to identify vocal sections in a music sample and design a classifier to perform a vocal–nonvocal segmentation task. Finally, we describe an application wherein we try to extract the melody from the separated vocal track using existing monophonic transcription techniques. The experimental work leads us to the conclusion that the quality of vocal source separation, albeit satisfactory, is not sufficient enough for further F0 analysis to extract...