A successful representation of objects in the literature models an object as a collection of patches, or parts, each with a certain appearance and position. The relative locations of the different parts of an object are constrained by the geometry of the object. Going beyond the patches on a single object, consider a collection of images of a particular class of scenes containing multiple (recurring) objects. The parts belonging to different objects are not constrained by such a geometry. However, the objects themselves, arguably due to their semantic relationships, demonstrate a pattern in their relative locations, which also propagates to their parts. Analyzing the interactions between the parts across the collection of images reveals these patterns, and the parts can be grouped accordingly. These groupings are typically hierarchical. We introduce hSO: Hierarchical Semantics of Objects, which is learnt from a collection of images of a particular scene and captures this hierarchical grouping. We p...