We present a holistic statistical model for the automatic analysis of complex scenes. Here, holistic refers to an integrated approach that does not take local decisions about segmentation or object transformations. Starting from Bayes’ decision rule, we develop an appearancebased approach explaining all pixels in the given scene using an explicit background model. This allows the training of object references from unsegmented data and recognition of complex scenes. We present empirical results on different databases obtaining state-of-the-art results on two databases where a comparison to other methods is possible. To obtain quantifiable results for object-based recognition, we introduce a new database with subsets of different difficulties.