Sciweavers

CVPR
2012
IEEE

Image categorization using Fisher kernels of non-iid image models

The bag-of-words (BoW) model treats an image as an unordered set of local regions and represents it by a visual word histogram. Implicitly, regions are assumed to be independent and identically distributed (iid), which is a poor assumption from a modeling perspective. We introduce non-iid models by treating the parameters of BoW models as latent variables which are integrated out, rendering all local regions dependent. Using the Fisher kernel, we encode an image by the gradient of the data log-likelihood w.r.t. the hyper-parameters that control priors on the model parameters. Our representation naturally involves discounting transformations similar to taking square-roots, providing an explanation of why such transformations have proven successful. Using variational inference, we extend the basic model to include Gaussian mixtures over local descriptors, and latent topic models to capture the co-occurrence structure of visual words, both improving performance. Our models yield state-of-the-...
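The idea of taking the gradient with respect to prior hyper-parameters can be illustrated with a minimal sketch. It assumes, as one plausible reading of the abstract, that the BoW multinomial has a Dirichlet prior with hyper-parameters alpha which is integrated out, giving a Dirichlet compound multinomial (Polya) likelihood over visual word counts. The function names are hypothetical, and the Fisher information normalization of the full Fisher kernel is omitted.

```python
import math

def polya_log_likelihood(counts, alpha):
    """Log-likelihood of a visual word sequence with the given counts, after
    integrating out the multinomial under a Dirichlet(alpha) prior (assumed model)."""
    total_alpha = sum(alpha)
    total_count = sum(counts)
    ll = math.lgamma(total_alpha) - math.lgamma(total_alpha + total_count)
    for n, a in zip(counts, alpha):
        ll += math.lgamma(a + n) - math.lgamma(a)
    return ll

def digamma_diff(a, n):
    """psi(a + n) - psi(a) for an integer n >= 0, via the digamma recurrence."""
    return sum(1.0 / (a + i) for i in range(n))

def polya_gradient(counts, alpha):
    """Gradient of the log-likelihood w.r.t. each alpha_k (unnormalized sketch)."""
    shared = digamma_diff(sum(alpha), sum(counts))
    return [digamma_diff(a, n) - shared for n, a in zip(counts, alpha)]

# Hypothetical histogram over three visual words with a uniform prior.
counts = [3, 0, 1]
alpha = [1.0, 1.0, 1.0]
grad = polya_gradient(counts, alpha)
```

Under these assumptions, the count-dependent term `digamma_diff(a, n)` grows roughly logarithmically in `n`, so each additional occurrence of a visual word contributes less than the previous one. This is one way to see a discounting effect comparable to a square-root transformation of the histogram.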
Added 28 Sep 2012
Updated 28 Sep 2012
Type Journal
Year 2012
Where CVPR
Authors Ramazan Gokberk Cinbis, Jakob J. Verbeek, Cordelia Schmid