This paper presents a novel approach for content-based analysis of karaoke music, which exploits multimodal content: synchronized lyrics text from the video channel, and the original singing and accompaniment carried in the two audio channels. We propose a novel video text extraction technique that accurately segments the bitmaps of lyrics text from the video frames and tracks the timing of their color changes, which are synchronized to the music. We also propose a technique that characterizes the original singing voice by analyzing the volume balance between the two audio channels. A novel music structure analysis method using both lyrics text and audio content is then proposed to precisely identify the verses and choruses of a song and to segment the lyrics into singing phrases. Experimental results on 20 karaoke music titles in different languages show that our proposed video text extraction technique can detect and segment the lyrics text with accuracy higher than 9...
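To make the channel volume-balance idea concrete, the following is a minimal sketch, not the paper's actual method. It assumes the karaoke convention that one audio channel carries accompaniment only while the other carries the original singing mixed with accompaniment, and flags frames where the voiced channel is noticeably louder; the function names, frame sizes, and threshold are illustrative assumptions.

```python
import numpy as np

def frame_rms(x, frame_len=2048, hop=512):
    """Short-time RMS energy of a mono signal (frame-wise loudness)."""
    n_frames = 1 + max(0, (len(x) - frame_len) // hop)
    return np.array([
        np.sqrt(np.mean(x[i * hop: i * hop + frame_len] ** 2))
        for i in range(n_frames)
    ])

def vocal_presence(voiced_ch, accomp_ch, frame_len=2048, hop=512, ratio_thresh=1.2):
    """
    Hypothetical volume-balance detector: frames where the channel containing
    the original singing is clearly louder than the accompaniment-only channel
    are marked as likely containing the singing voice.
    """
    rms_voiced = frame_rms(voiced_ch, frame_len, hop)   # assumed: vocal + accompaniment channel
    rms_accomp = frame_rms(accomp_ch, frame_len, hop)   # assumed: accompaniment-only channel
    ratio = rms_voiced / (rms_accomp + 1e-10)            # avoid division by zero in silent frames
    return ratio > ratio_thresh                          # boolean mask, one value per frame
```

In practice such a per-frame decision would need smoothing and a threshold tuned to the material; the sketch only illustrates why comparing the two channels' energies can reveal where the original singing is present.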