In this work, we are concerned with the detection of multiple objects in an image. We demonstrate that typically applied objectives have the structure of a random field model, but that the energies resulting from non-maximal suppression terms lead to the maximization of a submodular function. This is in general a difficult problem to solve, which is made worse by the very large size of the output space. We make use of an optimal approximation result for this form of problem by employing a greedy algorithm that finds one detection at a time. We show that we can adopt a branch-and-bound strategy that efficiently explores the space of all subwindows to optimally detect single objects while incorporating pairwise energies resulting from previous detections. This leads to a series of inter-related branch-and-bound optimizations, which we characterize by several new theoretical results. We then show empirically that optimal branch-and-bound efficiency gains can be achieved by a simple stra...
Matthew B. Blaschko
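The greedy scheme described in the abstract, which finds one detection at a time and penalizes each new candidate by pairwise terms from earlier detections, can be illustrated with a minimal sketch. This is not the paper's method: it replaces the branch-and-bound search over all subwindows with an exhaustive scan over a small, finite candidate list, and it assumes an intersection-over-union overlap as the pairwise penalty. The names `iou`, `greedy_detect`, `penalty_weight`, and `max_detections` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

# A box is (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2 (assumed convention).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_detect(
    boxes: Sequence[Box],
    scores: Sequence[float],
    penalty_weight: float = 1.0,
    max_detections: int = 10,
) -> List[int]:
    """Greedily pick candidate indices, one detection per iteration.

    The marginal gain of a candidate is its unary score minus the summed
    pairwise overlap penalties with all previously selected detections.
    Selection stops when no candidate has a positive marginal gain.
    """
    selected: List[int] = []
    remaining = set(range(len(boxes)))
    while remaining and len(selected) < max_detections:
        best_idx, best_gain = None, 0.0
        # Exhaustive scan stands in for the branch-and-bound search here.
        for i in sorted(remaining):
            penalty = sum(iou(boxes[i], boxes[j]) for j in selected)
            gain = scores[i] - penalty_weight * penalty
            if gain > best_gain:
                best_idx, best_gain = i, gain
        if best_idx is None:
            break
        selected.append(best_idx)
        remaining.remove(best_idx)
    return selected


if __name__ == "__main__":
    candidates = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    unary = [1.0, 0.9, 0.5]
    print(greedy_detect(candidates, unary, penalty_weight=2.0))
```

In this toy input, the second candidate overlaps heavily with the first, so its marginal gain becomes negative once the first is selected, while the distant third candidate is unaffected. Because the pairwise penalties are non-negative, marginal gains can only shrink as detections accumulate, which is the diminishing-returns property that makes a greedy strategy a natural fit for this kind of objective.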