Combining different, complementary object models promises to increase the robustness and generality of today’s computer vision algorithms. This paper introduces a new method for combining object models by determining a configuration of the models that maximizes their mutual information. The combination scheme thus creates a unified hypothesis from multiple object models “on the fly”, without prior training. To validate the effectiveness of the proposed method, the approach is applied to face detection, combining the outputs of three different models.
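As an illustrative sketch only (not the paper's actual formulation), the core idea of searching for a configuration that maximizes mutual information between model outputs can be expressed as follows. The function and parameter names (`mutual_information`, `best_configuration`, `max_shift`), the histogram-based MI estimator, and the toy configuration space of integer translations are all assumptions made for this example.

```python
import numpy as np

def mutual_information(a, b, bins=16):
    """Plug-in estimate of mutual information between two detector
    response maps, computed from their joint histogram."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()                  # joint distribution
    px = pxy.sum(axis=1, keepdims=True)        # marginal of a
    py = pxy.sum(axis=0, keepdims=True)        # marginal of b
    nz = pxy > 0                               # avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def best_configuration(map_a, map_b, max_shift=4):
    """Exhaustively search a toy configuration space (integer shifts of
    map_b relative to map_a) and keep the configuration with maximal MI."""
    best_mi, best_cfg = -np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(np.roll(map_b, dy, axis=0), dx, axis=1)
            mi = mutual_information(map_a, shifted)
            if mi > best_mi:
                best_mi, best_cfg = mi, (dy, dx)
    return best_cfg, best_mi
```

In this sketch, two response maps from different detectors would be passed to `best_configuration`, and the returned shift is the configuration under which the two models agree most strongly in the information-theoretic sense; no labeled training data is required for this step.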