Retrieving images from very large collections, using image content as a key, is becoming an important problem. Users prefer to ask for pictures using notions of content that are strongly oriented to the presence of abstractly de ned objects. Computer programs that implement these queries automatically are desirable, but are hard to build because conventional object recognition techniques from computer vision cannot recognize very general objects in very general contexts. This paper describes our approach to object recognition, which is structured around a sequence of increasingly specialized grouping activities that assemble coherent regions of image that can be shown to satisfy increasingly stringent constraints. The constraints that are satis ed provide a form of object classi cation in quite general contexts. This view of recognition is distinguished by: far richer involvement of early visual primitives, including color and texture; hierarchical grouping and learning strategies in t...
David A. Forsyth, Jitendra Malik, Thomas K. Leung,