This paper describes the development of a polyphonic music retrieval system based on the n-gram approach. Musical n-grams are constructed from polyphonic musical performances in MIDI using the pitch and rhythm dimensions of music. These are encoded as text characters, so that the resulting musical words can be indexed with existing text search engines. The Lemur Toolkit was adapted to build a demonstrator system on a collection of around 10,000 polyphonic MIDI performances. Indexing, search and retrieval with musical n-grams and this toolkit have been extensively evaluated in a series of experiments over the past three years, published elsewhere. We discuss how the system works internally and describe our proposed enhancements to Lemur towards the indexing of ‘overlaying’ as opposed to indexing a ‘bag of terms’. This includes enhancements to the parser for a ‘polyphonic musical word indexer’ to incorporate within-document position information ...
Shyamala Doraisamy, Stefan M. Rüger
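
The idea of turning polyphonic note data into text-like musical words, as outlined in the abstract, can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: notes arrive as `(onset, pitch)` pairs already extracted from MIDI, a window spans `n` consecutive distinct onsets, every monophonic path through a window yields one word, pitch is represented by the interval in semitones, and rhythm by a coarse three-way class of the ratio between successive inter-onset intervals. The function names and the character mapping (`encode_interval`, `encode_ratio`, `musical_words`) are hypothetical.

```python
from itertools import product


def group_by_onset(notes):
    """Group (onset, pitch) pairs into a time-ordered list of (onset, [pitches])."""
    groups = {}
    for onset, pitch in notes:
        groups.setdefault(onset, []).append(pitch)
    return [(t, sorted(groups[t])) for t in sorted(groups)]


def encode_interval(semitones):
    """Hypothetical mapping: 'Z' for a repeated pitch, uppercase letters for
    upward intervals, lowercase for downward ones, clamped to 25 semitones."""
    if semitones == 0:
        return "Z"
    letter = chr(ord("A") + min(abs(semitones), 25) - 1)  # 'A'..'Y'
    return letter if semitones > 0 else letter.lower()


def encode_ratio(ratio):
    """Hypothetical rhythm classes: '0' shorter, '1' equal, '2' longer."""
    if abs(ratio - 1.0) < 1e-9:
        return "1"
    return "0" if ratio < 1.0 else "2"


def musical_words(notes, n=3):
    """Slide a window of n distinct onsets over the notes and emit one word
    per monophonic path through the window (assumed scheme, n >= 2)."""
    events = group_by_onset(notes)
    words = []
    for i in range(len(events) - n + 1):
        window = events[i:i + n]
        onsets = [t for t, _ in window]
        iois = [b - a for a, b in zip(onsets, onsets[1:])]
        ratios = [encode_ratio(b / a) for a, b in zip(iois, iois[1:])]
        for path in product(*(pitches for _, pitches in window)):
            intervals = [encode_interval(b - a) for a, b in zip(path, path[1:])]
            # Interleave symbols: interval, ratio, interval, ratio, ...
            word = intervals[0] + "".join(
                r + iv for r, iv in zip(ratios, intervals[1:])
            )
            words.append(word)
    return words


# A two-note chord followed by two single notes, onsets in MIDI ticks.
notes = [(0, 60), (0, 64), (480, 62), (960, 65)]
words = musical_words(notes, n=3)
document = " ".join(words)  # plain text that a text indexer could tokenise
print(document)
```

Because each word is an ordinary string, the space-joined output can be passed to a text indexer in the same way as a natural-language document. A flat list like this discards where in the piece each word occurred, which is the ‘bag of terms’ limitation the abstract refers to.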