We present a method for the simultaneous detection and segmentation of people in static images. The proposed technique requires no manual segmentation during training and exploits top-down and bottom-up processing within a single framework for both object localization and 2D shape estimation. First, the coarse shape of the object is learned in a simple training phase using low-level edge features. Motivated by the observation that most object categories have regular shapes and closed boundaries, we then exploit relations between these features to derive mid-level cues such as continuity and closure. We introduce a novel Markov random field, defined on the edge features, that integrates the coarse shape information with the expectation that object boundaries are regular and closed. The algorithm is evaluated on pedestrian datasets of varying difficulty that cover a wide range of camera viewpoints and person orientations. Quantitative results are pre...
Vinay Sharma, James W. Davis