This paper describes a data-driven algorithm for the functionally correct spelling of MIDI pitch values in Western musical notation. The input consists of MIDI files containing accurate pitch and rhythmic information, together with corresponding ground-truth spellings for training and evaluation. The algorithm recovers harmonic information from the MIDI data and spells pitches according to their relation to the local tonic. It achieved 94.98% accuracy on the pitches that required accidentals in the local key and 99.686% accuracy overall. Of the features used to infer the correct spelling, voice-leading resolution proved to be the best. The paper also outlines considerable potential for improvement under this model.
Josh Stoddard, Christopher Raphael, Paul E. Utgoff