We introduce an exemplar model that can learn and generate a region of interest around class instances in a training set, given only a set of images containing the visual class. The model is scale and translation invariant. In the training phase, image regions that optimize an objective function are automatically located in the training images, without requiring any user annotation such as bounding boxes. The objective function measures visual similarity between training image pairs, using the spatial distribution of both appearance patches and edges. The optimization is initialized using discriminative features. The model enables the detection (localization) of multiple instances of the object class in test images, and can be used as a precursor to training other visual models that require bounding box annotation. The detection performance of the model is assessed on the PASCAL Visual Object Classes Challenge 2006 test set. For a number of object classes the performance far exceeds t...