Learning Combinations of Multiple Feature Representations for Music Emotion Prediction

10 years 2 months ago


Music consists of several structures and patterns evolving through time, which greatly influence the human decoding of higher-level cognitive aspects of music, such as the emotions expressed in it. For tasks such as genre, tag and emotion recognition, these structures have often been identified and used as individual, non-temporal features and representations. In this work, we address the hypothesis that using multiple temporal and non-temporal representations of different features is beneficial for modeling music structure, with the aim of predicting the emotions expressed in music. We test this hypothesis by representing temporal and non-temporal structures using generative models of multiple audio features. The representations are used in a discriminative setting via the Product Probability Kernel and the Gaussian Process model, enabling Multiple Kernel Learning that finds optimized combinations of both features and temporal/non-temporal representations. We show the increased p...
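The pipeline described in the abstract (generative model per excerpt, kernel between models, weighted combination of kernels) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single diagonal Gaussian per excerpt as the generative model (a non-temporal representation only), uses the closed-form probability product kernel with exponent 1 between two Gaussians, and fixes the kernel weights by hand, whereas the abstract says the combination is optimized through the Gaussian Process model. All function names, the toy data and the weights are hypothetical.

```python
import numpy as np

def fit_gaussian(frames):
    """Fit a diagonal Gaussian to an (n_frames, n_dims) feature matrix."""
    frames = np.asarray(frames, dtype=float)
    # Small variance floor (an assumption) keeps the kernel well defined.
    return frames.mean(axis=0), frames.var(axis=0) + 1e-6

def expected_likelihood_kernel(p, q):
    """Probability product kernel (exponent 1) between two diagonal Gaussians.

    Closed form: integral of N(x; mu_p, var_p) * N(x; mu_q, var_q) dx
                 = N(mu_p; mu_q, var_p + var_q).
    """
    (mu_p, var_p), (mu_q, var_q) = p, q
    var = var_p + var_q
    log_k = -0.5 * np.sum(np.log(2.0 * np.pi * var) + (mu_p - mu_q) ** 2 / var)
    return np.exp(log_k)

def gram_matrix(models):
    """Kernel (Gram) matrix over a list of fitted Gaussians."""
    n = len(models)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(i, n):
            K[i, j] = K[j, i] = expected_likelihood_kernel(models[i], models[j])
    return K

def combine_kernels(kernels, weights):
    """Weighted sum of kernel matrices; non-negative weights keep it a valid kernel."""
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0):
        raise ValueError("kernel weights must be non-negative")
    return sum(w * K for w, K in zip(weights, kernels))

# Toy data (hypothetical): 4 excerpts, two feature types with 3 and 2 dimensions.
rng = np.random.default_rng(0)
feat_a = [rng.normal(loc=i, size=(50, 3)) for i in range(4)]
feat_b = [rng.normal(loc=-i, size=(50, 2)) for i in range(4)]

K_a = gram_matrix([fit_gaussian(f) for f in feat_a])
K_b = gram_matrix([fit_gaussian(f) for f in feat_b])

# Hand-picked weights; a learned combination would tune these instead.
K = combine_kernels([K_a, K_b], weights=[0.7, 0.3])
```

The combined matrix `K` could then serve as the covariance matrix of a Gaussian Process over excerpts. A temporal representation would replace the single Gaussian with a model of the frame sequence, which this sketch does not attempt.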

Jens Madsen, Bjørn Sand Jensen, Jan Larsen
