We present a supervised binary encoding scheme for image retrieval that learns projections by taking into account the similarity between classes obtained from output embeddings. Our motivation is that binary hash codes learned in this way improve the visual quality of retrieval results by ranking images of related (or “sibling”) classes before images of unrelated classes. We employ a sequential greedy optimization that learns relationship-aware projections by minimizing the difference between the inner products of binary codes and those of output embedding vectors. We develop a joint optimization framework to learn projections that improve the accuracy of supervised hashing over the current state of the art with respect to both standard and sibling evaluation metrics. We further obtain discriminative features learned from correlations between kernelized input CNN features and output embeddings, which significantly boost performance. Experiments are performed on three datasets: CUB-2011, SUN-Attribute and ImageNet ...
Sravanthi Bondugula, Varun Manjunatha, Larry S. Da
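
As a rough illustration of the objective described in the abstract, the sketch below measures the gap between inner products of binary codes and inner products of output embeddings. It is not the authors' formulation: the function names `binarize` and `inner_product_loss`, the scaling of code inner products by the number of bits, the squared Frobenius norm, and the toy data are all assumptions made for this example.

```python
import numpy as np


def binarize(features, projections):
    """Project features and take the sign to get codes in {-1, +1} (assumed encoding)."""
    return np.where(features @ projections >= 0, 1.0, -1.0)


def inner_product_loss(codes, embeddings):
    """Squared Frobenius gap between code similarities and embedding similarities.

    codes:      (n, r) array with entries in {-1, +1}
    embeddings: (n, d) array of output embeddings, assumed unit-normalized
    """
    codes = np.asarray(codes, dtype=float)
    embeddings = np.asarray(embeddings, dtype=float)
    n_bits = codes.shape[1]
    code_sim = codes @ codes.T / n_bits      # scaled to [-1, 1]
    embed_sim = embeddings @ embeddings.T    # cosine similarity for unit vectors
    return float(np.sum((code_sim - embed_sim) ** 2))


# Toy data (hypothetical): items 0 and 1 belong to sibling classes, item 2 is unrelated.
embeddings = np.array([[1.0, 0.0], [0.8, 0.6], [-1.0, 0.0]])

# Codes that keep the sibling pair close and the unrelated item far away.
sibling_aware = np.array([[1, 1, 1, 1], [1, 1, 1, -1], [-1, -1, -1, -1]])
# Codes that wrongly place item 1 next to the unrelated item 2.
sibling_blind = np.array([[1, 1, 1, 1], [-1, -1, -1, 1], [-1, -1, -1, -1]])

print(inner_product_loss(sibling_aware, embeddings)
      < inner_product_loss(sibling_blind, embeddings))
```

Under these assumptions, codes whose Hamming neighborhoods mirror the embedding similarities incur the smaller loss, which is the behavior the sibling-aware ranking relies on.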