Mean shift clustering is a powerful unsupervised data
analysis technique that does not require prior knowledge
of the number of clusters, and does not constrain the shape
of the clusters. The data association criterion is based on the
underlying probability distribution of the data points, which
is defined in advance via the employed distance metric. In
many problem domains, the initially designed distance metric
fails to resolve the ambiguities in the clustering process.
We present a novel semi-supervised kernel mean shift algorithm
where the inherent structure of the data points is
learned with a few user-supplied constraints in addition to
the original metric. The constraints we consider are
pairs of points that should be clustered together. The data
points are implicitly mapped to a higher-dimensional space
induced by the kernel function, where the constraints can be
effectively enforced. Mode seeking is then performed in
the embedded space and the approach pr...