Unsupervised clustering can be significantly improved using supervision in the form of pairwise constraints, i.e., pairs of instances labeled as belonging to the same or different clu...
Duplicate detection is the problem of detecting different entries in a data source representing the same real-world entity. While research abounds in the realm of duplicate detect...
The permanent growth of personal music collections caused by the ongoing digital revolution asks for novel ways of organization. Traditional list-based approaches are—...
Lukas Bossard, Michael Kuhn, Roger Wattenhofe...
This paper proposes a query by humming method based on locality sensitive hashing (LSH). The method constructs an index of melodic fragments by extracting pitch vectors from a dat...
Internet videos have grown exponentially with the help of video-sharing websites. Automatic topic mining is therefore increasingly important for organizing and navigating such l...
Lu Liu, Yong Rui, Lifeng Sun, Bo Yang, Jianwei Zha...