This paper proposes an approach for object class localization which goes beyond bounding boxes, as it also determines the outline of the object. Unlike most current localization methods, our approach does not require any hypothesis parameter space to be defined. Instead, it directly generates, evaluates and clusters shape masks. Thus, the presented framework produces more informative results for object class localization. For example, it easily learns and detects possible object viewpoints and articulations, which are often well characterized by the object outline. We evaluate the proposed approach on the challenging natural-scene Graz-02 object classes dataset. The results demonstrate the extended localization capabilities of our method.