Considerable advances have been made in learning to recognize and localize visual object classes. Simple bag-of-features approaches label each pixel or patch independently. More advanced models attempt to improve the coherence of the labellings by introducing some form of inter-patch coupling: traditional spatial models such as MRFs provide crisper local labellings by exploiting neighbourhood-level couplings, while aspect models such as PLSA and LDA use global relevance estimates (global mixing proportions for the classes appearing in the image) to shape the local choices. We point out that the two approaches are complementary, and combine them to produce aspect-based spatial field models that outperform both approaches. We study two spatial models: one based on averaging over forests of minimal spanning trees linking neighboring image regions, the other on an efficient chain-based Expectation Propagation method for regular 8-neighbor Markov Random Fields. The models can be trained u...
Jakob J. Verbeek, Bill Triggs