Abstract. We present a novel approach to building a general vision system. The method is based on a visual grammar representation that is transformed into a Bayesian network, which is then used for object recognition. We use a symbol-relational grammar to describe objects hierarchically, incorporating spatial relations. The structure of the Bayesian network is obtained automatically from the grammar, and its parameters are learned from examples. The method is illustrated with two examples of face recognition.
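
To make the grammar-to-network idea concrete, the following minimal sketch shows one plausible way a hierarchical grammar with spatial relations could be turned into directed network edges, and how a conditional probability could be estimated from examples by counting. It is an illustration under stated assumptions, not the method's actual construction: the toy grammar, the node naming, the choice to attach each relation node as a child of its composite symbol, and the helper names `grammar_to_edges` and `learn_cpt` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy grammar (illustrative names only): each rule maps a
# composite symbol to its parts and to spatial relations between parts.
GRAMMAR = {
    "face": {
        "parts": ["eyes", "nose", "mouth"],
        "relations": [("above", "eyes", "nose"), ("above", "nose", "mouth")],
    },
    "eyes": {
        "parts": ["left_eye", "right_eye"],
        "relations": [("left_of", "left_eye", "right_eye")],
    },
}


def grammar_to_edges(grammar):
    """Return directed (parent, child) edges derived from the grammar.

    Assumption: each part and each relation becomes a child node of the
    composite symbol on the left-hand side of its rule.
    """
    edges = []
    for symbol, rule in grammar.items():
        for part in rule["parts"]:
            edges.append((symbol, part))
        for name, a, b in rule["relations"]:
            edges.append((symbol, f"{name}({a},{b})"))
    return edges


def learn_cpt(examples, parent, child):
    """Maximum-likelihood estimate of P(child=True | parent=value).

    `examples` is a list of dicts mapping node names to booleans.
    """
    counts = defaultdict(lambda: [0, 0])  # parent value -> [total, child true]
    for ex in examples:
        entry = counts[ex[parent]]
        entry[0] += 1
        entry[1] += int(ex[child])
    return {value: n_true / n for value, (n, n_true) in counts.items()}


if __name__ == "__main__":
    for parent, child in grammar_to_edges(GRAMMAR):
        print(f"{parent} -> {child}")
```

In this sketch the network structure follows directly from the rules, so only the probabilities need training data; a real system would also need an inference procedure and a way to ground the terminal symbols in image features, which are outside the scope of this illustration.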