Traditional approaches for content-based image querying typically compute a single signature for each image based on color histograms, texture, wavelet transforms etc., and return as the query result, images whose signatures are closest to the signature of the query image. Therefore, most traditional methods break down when images contain similar objects that are scaled differently or at different locations, or only certain regions of the image match. In this paper, we propose WALRUS (WAveLet-based Retrieval of User-specified Scenes), a novel similarity retrieval algorithm that is robust to scaling and translation of objects within an image. WALRUS employs a novel similarity model in which each image is first decomposed into its regions, and the similarity measure between a pair of images is then defined to be the fraction of the area of the two images covered by matching regions from the images. In order to extract regions for an image, WALRUS considers sliding windows of varying ...