We propose an approach for identifying and segmenting objects in scenes that a person (or robot) encounters during Activities of Daily Living (ADL). Images collected in such cluttered scenes contain multiple objects, and each image provides only a partial, possibly very different, view of each object. An object instance discovery program must therefore link pieces of visual information across multiple images and extract the patterns that remain consistent. Most work on unsupervised discovery of object models is concerned with object categories. In contrast, this paper aims to identify and extract regions corresponding to specific object instances, e.g., two different laptops in the laptop category. By focusing on specific instances, we enforce explicit constraints on geometric consistency (such as scale and orientation) and appearance consistency (such as color, texture, and shape). Using multiple segmentations as the basic building block, our program processes a noisy “soup” of segments and extrac...