We seek a framework that addresses localization, detection and recognition of man-made objects in natural-scene images in a unified manner. We propose to model artificial structures by dynamic tree-structured belief networks (DTSBNs). DTSBNs define a distribution over tree structures, which we learn using our Structured Variational Approximation (SVA) inference algorithm. Furthermore, we propose multiscale linear-discriminant analysis (MLDA) as a feature extraction method. MLDA appears well suited to our goals because we assume that man-made objects are characterized primarily by geometric regularities and by patches of uniform color. MLDA decomposes an image into dyadic squares and extracts edges over a finite range of locations, orientations and scales. Both the color of the dyadic squares and the geometric properties of the extracted edges serve as observable input to our DTSBNs. Experimental results demonstrate that DTSBNs, trained on MLDA features, offer a viable solution for detection of artificial...
Michael C. Nechyba, Sinisa Todorovic
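
To illustrate the dyadic-square decomposition mentioned in the abstract, the following is a minimal sketch of how an image can be tiled into dyadic squares at several scales, with a mean color recorded per square. It is not the authors' MLDA implementation: edge extraction is omitted entirely, and the function name `dyadic_square_means`, the list-of-rows image representation, and the choice of mean color as the per-square statistic are assumptions made only for this example.

```python
def dyadic_square_means(image, max_level):
    """Return {level: grid}, where grid[i][j] is the mean color of square (i, j).

    Hypothetical helper for illustration only.
    `image` is a square list of rows; each pixel is a tuple of channel values.
    Level k tiles the image with 2**k x 2**k equal squares.
    """
    size = len(image)
    if any(len(row) != size for row in image) or size % (2 ** max_level) != 0:
        raise ValueError("need a square image whose side is divisible by 2**max_level")
    channels = len(image[0][0])
    means = {}
    for level in range(max_level + 1):
        n = 2 ** level          # squares per side at this level
        side = size // n        # pixel side length of each square
        grid = []
        for i in range(n):
            row = []
            for j in range(n):
                # Collect all pixels that fall inside dyadic square (i, j).
                pixels = [image[r][c]
                          for r in range(i * side, (i + 1) * side)
                          for c in range(j * side, (j + 1) * side)]
                row.append(tuple(sum(p[ch] for p in pixels) / len(pixels)
                                 for ch in range(channels)))
            grid.append(row)
        means[level] = grid
    return means


# Usage: a 4x4 image whose left half is red and right half is blue.
RED, BLUE = (255, 0, 0), (0, 0, 255)
image = [[RED, RED, BLUE, BLUE] for _ in range(4)]
means = dyadic_square_means(image, max_level=2)
```

In this toy image the coarsest level mixes both colors, while level 1 already separates the uniform red and blue patches. This matches the intuition in the abstract that patches of uniform color show up as individual dyadic squares at an appropriate scale.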