This paper addresses the problem of object detection and recognition in complex scenes, where objects are partially occluded. The approach presented herein is based on the hypothesis that a careful analysis of visible object details at various scales is critical for recognition in such settings. In general, however, computational complexity becomes prohibitive when trying to analyze multiple sub-parts of multiple objects in an image. To alleviate this problem, we propose a generative-model framework – namely, dynamic tree-structure belief networks (DTSBNs). This framework formulates object detection and recognition as inference of DTSBN structure and image-class conditional distributions, given an image. The causal (Markovian) dependencies in DTSBNs allow for design of computationally efficient inference, as well as for interpretation of the estimated structure as follows: each root represents a whole distinct object, while children nodes down the sub-tree represent parts of that o...
Sinisa Todorovic, Michael C. Nechyba