This paper presents a novel Gaussianized vector representation for scene images, obtained by an unsupervised approach. First, each image is encoded as an orderless bag of features. A global Gaussian mixture model (GMM) learned from all images is then used to randomly assign each feature to one Gaussian component by a multinomial trial, where the parameters of the multinomial distribution are the posteriors of the feature over all Gaussian components. Finally, the normalized means of the features assigned to each Gaussian component are concatenated to form a supervector, which is a compact representation of the scene image. We prove that these supervectors follow the standard normal distribution. Our experiments on scene categorization tasks show that this vector representation significantly improves performance compared with the bag-of-features representation.
Hao Tang, Mark Hasegawa-Johnson, Thomas S. Huang
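
The pipeline described in the abstract (posteriors, one multinomial trial per feature, normalized per-component means, concatenation) can be illustrated with a minimal NumPy sketch. Several details here are assumptions rather than the paper's stated method: the GMM is taken as already trained with diagonal covariances, the normalization is assumed to be sqrt(n_k) * (mean_k - mu_k) / sigma_k (one choice that would give unit variance if the assigned features were drawn from component k), and empty components are filled with zeros. The function names `gmm_posteriors` and `gaussianized_supervector` are hypothetical.

```python
import numpy as np


def gmm_posteriors(X, weights, means, variances):
    """Posterior of each feature over all components (diagonal-covariance GMM)."""
    diff = X[:, None, :] - means[None, :, :]                  # (n, K, d)
    log_pdf = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                      + np.sum(np.log(2 * np.pi * variances), axis=1))
    log_post = np.log(weights) + log_pdf                      # (n, K)
    log_post -= log_post.max(axis=1, keepdims=True)           # numerical stability
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)


def gaussianized_supervector(X, weights, means, variances, rng):
    """Assign each feature to one component at random, then stack normalized means."""
    post = gmm_posteriors(X, weights, means, variances)
    K, d = means.shape

    # One multinomial trial per feature, via inverse-CDF sampling on its posteriors.
    u = rng.random(len(X))
    cdf = np.cumsum(post, axis=1)
    assign = np.minimum((u[:, None] > cdf).sum(axis=1), K - 1)

    blocks = np.zeros((K, d))                                 # assumption: zeros if empty
    for k in range(K):
        members = X[assign == k]
        n_k = len(members)
        if n_k > 0:
            # Assumed normalization: center by mu_k, scale by sigma_k / sqrt(n_k).
            blocks[k] = (np.sqrt(n_k) * (members.mean(axis=0) - means[k])
                         / np.sqrt(variances[k]))
    return blocks.ravel()                                     # length K * d


# Toy example with a hand-specified two-component GMM (illustrative values only).
rng = np.random.default_rng(0)
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [5.0, 5.0]])
variances = np.ones((2, 2))
X = np.vstack([rng.normal(means[0], 1.0, size=(50, 2)),
               rng.normal(means[1], 1.0, size=(50, 2))])
v = gaussianized_supervector(X, weights, means, variances, rng)
print(v.shape)  # (4,)
```

Because each feature is assigned by a random draw rather than by its most likely component, two runs with different seeds generally give different supervectors for the same image; the sketch passes the random generator explicitly so that results are reproducible.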