We present an approach to vision-based person detection for robotic applications that integrates top-down template matching with bottom-up classifiers. Rather than matching the whole body at once, we detect components of the human silhouette, such as the torso and legs; this component-based approach is more invariant than monolithic methods to the wide variety of poses a person can adopt. We detect edges in each image, apply a distance transform, and then match templates at multiple scales. This matching process generates a focus of attention, a set of candidate people, that is later confirmed by a trained Support Vector Machine (SVM) classifier. Our results show that the method is both fast and precise and is directly applicable in robotic architectures.
Carlos D. Castillo, Carolina Chang
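
The candidate-generation stage described in the abstract (edge map, distance transform, template matching at several scales) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the city-block distance metric, the mean-distance score, and the exhaustive scan over positions are all assumptions made for illustration, and this style of matching is commonly called chamfer matching. The edge detector and the SVM confirmation stage are omitted; the sketch takes a binary edge map as input and returns candidate placements only.

```python
# Hypothetical sketch: all names, the city-block metric and the
# scoring rule are assumptions, not taken from the paper.
INF = 10**9


def distance_transform(edges):
    """Two-pass city-block distance to the nearest edge pixel.

    `edges` is a list of rows containing 0 (background) or 1 (edge).
    """
    h, w = len(edges), len(edges[0])
    dist = [[0 if edges[y][x] else INF for x in range(w)] for y in range(h)]
    # Forward pass: propagate distances from the top and left neighbours.
    for y in range(h):
        for x in range(w):
            if y > 0:
                dist[y][x] = min(dist[y][x], dist[y - 1][x] + 1)
            if x > 0:
                dist[y][x] = min(dist[y][x], dist[y][x - 1] + 1)
    # Backward pass: propagate distances from the bottom and right neighbours.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            if y < h - 1:
                dist[y][x] = min(dist[y][x], dist[y + 1][x] + 1)
            if x < w - 1:
                dist[y][x] = min(dist[y][x], dist[y][x + 1] + 1)
    return dist


def scale_template(points, scale):
    """Scale template edge points (dy, dx), dropping duplicates after rounding."""
    return sorted({(round(dy * scale), round(dx * scale)) for dy, dx in points})


def chamfer_score(dist, points, top, left):
    """Mean distance from template points to the nearest image edge (lower is better)."""
    h, w = len(dist), len(dist[0])
    total = 0
    for dy, dx in points:
        y, x = top + dy, left + dx
        if not (0 <= y < h and 0 <= x < w):
            return float("inf")  # template does not fit at this placement
        total += dist[y][x]
    return total / len(points)


def find_candidates(edges, template, scales, threshold):
    """Return sorted (score, scale, top, left) tuples with score <= threshold."""
    dist = distance_transform(edges)
    h, w = len(edges), len(edges[0])
    candidates = []
    for scale in scales:
        points = scale_template(template, scale)
        for top in range(h):
            for left in range(w):
                score = chamfer_score(dist, points, top, left)
                if score <= threshold:
                    candidates.append((score, scale, top, left))
    return sorted(candidates)
```

In a full pipeline along the lines of the abstract, each returned placement would be cropped from the image and passed to the trained classifier for confirmation; computing the distance transform once per image is what keeps the repeated template comparisons cheap.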