Robust semantic labeling of image regions is a basic problem in representing and retrieving image and video content. We propose an SVM-MRF framework that models features and their spatial distributions as a step toward a "semantic" representation. Eigenfeatures of Gabor wavelet features and a Gaussian mixture model are used for feature clustering. Because similar feature vectors in one cluster can come from several different semantic classes, an SVM is applied to represent the conditional distributions of feature vectors within each cluster, and a Markov random field is used to model the spatial distribution of the semantic labels. A semantic layout representation is proposed to describe the semantics of images. Experiments show that this method can improve semantic labeling and is useful in similarity search.
Lei Wang, B. S. Manjunath
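
To illustrate the spatial-modeling step described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a Potts-style smoothness prior on a 4-connected grid of image blocks and uses iterated conditional modes (ICM) for inference; the abstract states neither choice. The names `icm_smooth`, `unary`, and `beta` are hypothetical, and the per-block costs stand in for classifier outputs such as negated SVM scores.

```python
def icm_smooth(unary, beta=1.0, n_iters=5):
    """Smooth a grid of block labels with ICM under an assumed Potts prior.

    unary[r][c][k] is the cost of giving label k to block (r, c); lower is
    better. beta weights the penalty for disagreeing with a 4-neighbour.
    Both are illustrative assumptions, not values from the paper.
    """
    rows, cols = len(unary), len(unary[0])
    n_labels = len(unary[0][0])

    # Start from the classifier-only decision at every block.
    labels = [
        [min(range(n_labels), key=lambda k: unary[r][c][k]) for c in range(cols)]
        for r in range(rows)
    ]

    for _ in range(n_iters):
        changed = False
        for r in range(rows):
            for c in range(cols):
                # Current labels of the in-bounds 4-connected neighbours.
                neighbours = [
                    labels[rr][cc]
                    for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= rr < rows and 0 <= cc < cols
                ]

                def energy(k):
                    # Data cost plus one penalty per disagreeing neighbour.
                    return unary[r][c][k] + beta * sum(1 for n in neighbours if n != k)

                best = min(range(n_labels), key=energy)
                if best != labels[r][c]:
                    labels[r][c] = best
                    changed = True
        if not changed:
            break  # reached a local minimum of the energy
    return labels


# Toy example: every block prefers label 0 except a weakly dissenting centre.
toy = [[[0.0, 2.0] for _ in range(3)] for _ in range(3)]
toy[1][1] = [1.0, 0.5]
print(icm_smooth(toy, beta=1.0))  # [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
```

With `beta=0.0` the spatial term vanishes and the centre block keeps its classifier-only label, which shows how the neighbourhood prior, rather than the per-block scores alone, resolves locally ambiguous regions in this sketch.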