One of the key issues in content-based image retrieval is measuring the similarity of images. Images are represented as points in the space of low-level visual features, and most similarity measures are based on some distance between these feature vectors. Given a distance metric, two images separated by a shorter distance are deemed to be more similar than two images that are far apart. The well-known problem with such similarity measures is the semantic gap: two images separated by a large distance may still share the same semantic content. In this paper, we propose a novel similarity measure for images that goes beyond distance measurement. The key idea is to exploit the clustering structure of images when a large number of images are present. The similarity of two images is determined not only by their Euclidean distance in the space of visual features but also by the likelihood that they are clustered together, which is estimated using a marginalized kernel. Our empirical s...
Feng Kang, Rong Jin, Steven C. H. Hoi
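
To make the idea in the abstract concrete, the following is a minimal sketch of a similarity that combines a Euclidean-distance kernel with a co-clustering likelihood. It is not the authors' formulation: it assumes a standard marginalized-kernel form in which the hidden variable is a cluster label, so that the similarity is the sum over clusters k of P(k|x) P(k|y) times a base kernel k(x, y). The function names, the softmax-based cluster posteriors, the fixed cluster centers, and the parameters `gamma` and `beta` are all hypothetical choices made for illustration.

```python
import numpy as np

def soft_assignments(X, centers, beta=1.0):
    """Assumed cluster posteriors P(k | x): softmax over negative squared distances."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)  # shape (n, k)
    logits = -beta * d2
    logits -= logits.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def rbf_kernel(X, gamma=1.0):
    """Base kernel that depends only on Euclidean distance between feature vectors."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)  # shape (n, n)
    return np.exp(-gamma * d2)

def marginalized_similarity(X, centers, gamma=1.0, beta=1.0):
    """Sum over clusters of P(k|x_i) P(k|x_j), multiplied by the distance-based kernel."""
    P = soft_assignments(X, centers, beta)
    co_cluster = P @ P.T  # likelihood that images i and j fall in the same cluster
    return co_cluster * rbf_kernel(X, gamma)

# Three 1-D "images": x1 is equally far from x0 and x2,
# but x0 and x1 lie near the same (assumed) cluster center.
X = np.array([[0.0], [1.5], [3.0]])
centers = np.array([[0.0], [4.0]])
S = marginalized_similarity(X, centers)
print(S[0, 1] > S[1, 2])
```

In this toy setting the pairs (x0, x1) and (x1, x2) have identical Euclidean distance, yet the first pair receives the higher similarity because both points are likely to belong to the same cluster. In practice the cluster posteriors would come from a clustering model fitted to the image collection rather than from fixed centers.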