In this paper we propose a novel approach to the perceptual interpretation of building facades that combines shape grammars, supervised classification and random walks. Procedural modeling captures the geometric and photometric variation of buildings. It is fused with visual classification techniques (randomized forests) that provide a crude probabilistic interpretation of the observation space, which is used to measure how well a procedural generation agrees with the image. A random exploration of the grammar space then optimizes the sequence of derivation rules towards a semantico-geometric interpretation of the observations. Experiments conducted on architecturally complex facades with ground truth validate the approach.
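To make the interplay between the grammar, the classifier output and the random exploration concrete, the following is a minimal sketch on a one-dimensional toy problem. It is not the method of the paper. The single split rule, the two classes, the synthetic probability map standing in for randomized-forest output, the negative log-likelihood merit function, and the greedy acceptance rule are all assumptions chosen for illustration, and every name (`derive`, `energy`, `crude_probs`, `random_walk`) is hypothetical.

```python
import math
import random

WALL, WINDOW = 0, 1  # assumed two-class label set


def derive(wall_h, window_h, height):
    """Toy grammar: repeat (wall band, window band) down a vertical strip."""
    period = wall_h + window_h
    return [WALL if (y % period) < wall_h else WINDOW for y in range(height)]


def energy(labels, probs):
    """Negative log-likelihood of a labeling under per-pixel class probabilities."""
    return -sum(math.log(probs[y][c]) for y, c in enumerate(labels))


def crude_probs(truth, confidence=0.8):
    """Synthetic stand-in for a classifier: the true class gets `confidence`."""
    return [[confidence if c == t else 1.0 - confidence for c in (WALL, WINDOW)]
            for t in truth]


def random_walk(probs, max_size=6, steps=2000, seed=0):
    """Greedy random exploration of the rule parameters (wall_h, window_h)."""
    rng = random.Random(seed)
    height = len(probs)
    best = (rng.randint(1, max_size), rng.randint(1, max_size))
    best_e = energy(derive(best[0], best[1], height), probs)
    for _ in range(steps):
        if rng.random() < 0.3:
            # Global move: resample both parameters uniformly.
            cand = (rng.randint(1, max_size), rng.randint(1, max_size))
        else:
            # Local move: perturb each parameter by at most one unit.
            cand = tuple(min(max_size, max(1, v + rng.choice((-1, 0, 1))))
                         for v in best)
        e = energy(derive(cand[0], cand[1], height), probs)
        if e < best_e:  # keep only improving derivations
            best, best_e = cand, e
    return best, best_e


truth = derive(2, 4, 24)        # hidden layout used to build the toy data
probs = crude_probs(truth)      # crude probabilistic interpretation
best, best_e = random_walk(probs)
print(best, round(best_e, 3))
```

The merit function only ever sees the probability map, never the hidden layout, which mirrors the role the abstract assigns to the classifier: it scores a candidate derivation against the image. A real facade grammar would have many rules and a far larger derivation space, where a purely greedy acceptance rule would likely be insufficient.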