We consider the problem of clustering a given dataset into k clusters subject to an additional set of constraints on relative distance comparisons between the data items. The additional constraints are meant to reflect side information that is not directly expressed in the feature vectors. Relative comparisons can express structure at a finer level of detail than the must-link (ML) and cannot-link (CL) constraints commonly used in semi-supervised clustering. They are particularly useful in settings where giving an ML or a CL constraint is difficult because the granularity of the true clustering is unknown. Our main contribution is an efficient algorithm for learning a kernel matrix using the log-determinant (LogDet) divergence, a Bregman matrix divergence, subject to a set of relative distance constraints. Given the learned kernel matrix, a clustering can be obtained by any suitable algorithm, such as kernel k-means. We show empirically that kernels found by our ...
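As a rough illustration of the kind of optimization involved, the sketch below learns a kernel K that stays close to an initial kernel K0 in LogDet divergence while softly penalizing violated relative comparisons (i, j, l), read as "item i should be closer to item j than to item l". This is a minimal sketch and not the paper's algorithm: it uses plain projected gradient descent with a hinge penalty, and the names and parameters (`learn_kernel`, `margin`, `lam`, `lr`) are illustrative assumptions.

```python
# Minimal sketch (assumed names/parameters, not the paper's method):
# minimize  D_ld(K, K0) + lam * sum_t hinge(d_K(i,j) - d_K(i,l) + margin)
# over PSD kernels K, where D_ld(K, K0) = tr(K K0^{-1}) - logdet(K K0^{-1}) - n.
import numpy as np

def kernel_distance(K, a, b):
    """Squared distance between items a and b induced by kernel K."""
    return K[a, a] + K[b, b] - 2.0 * K[a, b]

def project_psd(K, eps=1e-8):
    """Project a symmetric matrix onto the PSD cone by clipping eigenvalues."""
    w, V = np.linalg.eigh((K + K.T) / 2.0)
    return (V * np.maximum(w, eps)) @ V.T

def learn_kernel(K0, comparisons, margin=0.1, lam=1.0, lr=0.01, n_iter=500):
    """Projected gradient descent on the penalized LogDet objective.

    comparisons: iterable of triples (i, j, l) meaning d(i,j) < d(i,l).
    """
    K = K0.copy()
    K0_inv = np.linalg.inv(K0)
    for _ in range(n_iter):
        # Gradient of D_ld(K, K0) with respect to K is K0^{-1} - K^{-1}.
        G = K0_inv - np.linalg.inv(K)
        # Subgradients of hinge losses for the violated comparisons only.
        for (i, j, l) in comparisons:
            if kernel_distance(K, i, j) - kernel_distance(K, i, l) + margin > 0:
                # d(i,j) - d(i,l) = tr(K C) for the rank-2 matrix C below.
                C = np.zeros_like(K)
                for (a, b, s) in [(i, j, 1.0), (i, l, -1.0)]:
                    C[a, a] += s; C[b, b] += s
                    C[a, b] -= s; C[b, a] -= s
                G += lam * C
        K = project_psd(K - lr * G)
    return K
```

The eigenvalue clipping in `project_psd` keeps K strictly positive definite, so the LogDet divergence and its gradient stay well defined throughout; the learned kernel can then be handed to any kernel clustering routine, such as kernel k-means.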