Discovering common objects that appear frequently across a set of images is a challenging problem because of (1) the appearance variations of the same common object and (2) the enormous computational cost of exploring the huge solution space, which includes the location, scale, and number of common objects. We characterize each image as a collection of visual primitives and propose a novel bottom-up approach that gradually prunes local primitives to recover the whole common object. A multi-layer candidate pruning procedure is designed to accelerate the image data mining process. Our solution localizes the common object accurately and is thus able to crop common objects despite variations in scale, viewpoint, and lighting conditions. Moreover, it can extract common objects even from a small number of images. Experiments on challenging image and video datasets validate the effectiveness and efficiency of our method.

Categories and Subject Descriptors H.3.3 [Infor...