We present a simple framework to model contextual
relationships between visual concepts. The new framework
combines ideas from previous object-centric methods
(which model contextual relationships between objects in
an image, such as their co-occurrence patterns) and scene-centric methods (which learn a holistic context model from
the entire image, known as its “gist”). This is accomplished
without demarcating individual concepts or regions in the
image. First, using the output of a generic appearance-based concept detection system, a semantic space is formulated,
where each axis represents a semantic feature. Next,
context models are learned for each of the concepts in the
semantic space, using mixtures of Dirichlet distributions.
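
As a sketch of this step (the notation here is illustrative, not necessarily the paper's own): if the semantic space has $L$ axes, the context model of a concept $w$ could be a $K$-component Dirichlet mixture over the probability simplex,

$$
P_{\Pi \mid W}(\boldsymbol{\pi} \mid w) \;=\; \sum_{k=1}^{K} \lambda_k^{w}\, \mathrm{Dir}\!\left(\boldsymbol{\pi};\, \boldsymbol{\alpha}_k^{w}\right),
\qquad
\mathrm{Dir}(\boldsymbol{\pi}; \boldsymbol{\alpha}) \;=\; \frac{\Gamma\!\left(\sum_{i=1}^{L}\alpha_i\right)}{\prod_{i=1}^{L}\Gamma(\alpha_i)}\, \prod_{i=1}^{L} \pi_i^{\alpha_i - 1},
$$

where $\boldsymbol{\pi}$ is the vector of semantic-feature responses for an image (a point on the simplex), and the mixture weights $\lambda_k^{w}$ and Dirichlet parameters $\boldsymbol{\alpha}_k^{w}$ are learned per concept.
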
Finally, an image is represented as a vector of posterior
concept probabilities under these contextual concept models.
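
In the same illustrative notation, each entry of this representation would follow by Bayes' rule,

$$
P_{W \mid \Pi}(w \mid \boldsymbol{\pi}) \;=\; \frac{P_{\Pi \mid W}(\boldsymbol{\pi} \mid w)\, P_W(w)}{\sum_{w'} P_{\Pi \mid W}(\boldsymbol{\pi} \mid w')\, P_W(w')},
$$

so the image descriptor is the vector of these posteriors over all concepts.
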
It is shown that these posterior probabilities are remarkably noise-free and an effective model of the contextual
rela...