Sciweavers

ISMIR
2005
Springer

Improving Content-Based Similarity Measures by Training a Collaborative Model

We observed that for multimedia data, especially music, collaborative similarity measures perform much better than similarity measures derived from content-based sound features. Our observation is based on a large-scale evaluation with >250,000,000 collaborative data points crawled from the web and >190,000 songs annotated with content-based sound feature sets. A song mentioned in a playlist is regarded as one collaborative data point. In this paper we present a novel approach to bridging the performance gap between collaborative and content-based similarity measures. In the initial training phase, a model vector for each song is computed from collaborative data. Each vector consists of 200 overlapping, unlabelled 'genres' or song clusters. Instead of explicit numerical voting, we use implicit user profile data as the collaborative data source; such data is available, for example, as purchase histories in many large-scale e-commerce applications. After the training ...
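The idea of treating each song mention in a playlist as one collaborative data point can be illustrated with a small sketch. This is not the authors' model: the abstract does not say how the 200-cluster model vectors are trained, so the code below only shows one common way to derive a collaborative similarity, namely a cosine measure over playlist co-occurrence counts. The function name `cooccurrence_similarity` and the toy playlists are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def cooccurrence_similarity(playlists):
    """Return a cosine-style similarity for every song pair that shares a playlist.

    Assumption: similarity = (#playlists containing both songs)
                             / sqrt(#playlists with song a * #playlists with song b).
    """
    song_counts = defaultdict(int)  # playlists containing each song
    pair_counts = defaultdict(int)  # playlists containing both songs of a pair

    for playlist in playlists:
        songs = sorted(set(playlist))  # count a song once per playlist
        for song in songs:
            song_counts[song] += 1
        for a, b in combinations(songs, 2):  # pairs are in sorted order
            pair_counts[(a, b)] += 1

    return {
        (a, b): n / sqrt(song_counts[a] * song_counts[b])
        for (a, b), n in pair_counts.items()
    }


# Toy data: each inner list is one playlist (hypothetical song IDs).
playlists = [
    ["song_a", "song_b", "song_c"],
    ["song_a", "song_b"],
    ["song_c", "song_d"],
]
sim = cooccurrence_similarity(playlists)
for pair, score in sorted(sim.items()):
    print(pair, round(score, 3))
```

Songs that never share a playlist get no entry at all, which is the sparsity problem that motivates mapping content-based features onto a model learned from collaborative data.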
Richard Stenzel, Thomas Kamps
Added 27 Jun 2010
Updated 27 Jun 2010
Type Conference
Year 2005
Where ISMIR