As it becomes increasingly viable to capture, store, and share large amounts of image and video data, automatic image analysis is crucial to managing visual information. Many problems demand fast, accurate search of very large image databases, but the most effective metrics for image comparison often do not mesh well with known efficient search methods. Specifically, useful image representations use "structured" (non-vector) inputs that require specialized distance functions to compare, and making the best use of side information that complements the visual data itself (e.g., partial annotations or tags) may require learning a task-specific metric. This paper gives an overview of our research on robust measures of image similarity designed to accommodate complex feature spaces and massive image databases. In particular, we survey efficient strategies for metrics that match local image features or learn from similarity constraints, and show how to perform sub-linear time sea...