Object recognition is challenging due to high intra-class
variability caused, e.g., by articulation, viewpoint changes,
and partial occlusion. Successful methods need to strike a
balance between being flexible enough to model such variation
and discriminative enough to detect objects in cluttered,
real-world scenes. Motivated by these challenges, we
propose a latent conditional random field (CRF) based on
a flexible assembly of parts. By modeling part labels as
hidden nodes and developing an EM algorithm for learning
from class labels alone, this new approach enables the automatic
discovery of semantically meaningful object part representations.
To increase the flexibility and expressiveness
of the model, we learn the pairwise structure of the underlying
graphical model at the level of object part interactions.
Efficient gradient-based techniques are used to estimate the
structure of the domain of interest and are carried forward to
the multi-label, or object part, case. Our e...