ABSTRACT. Annotating photographic images is a complex problem: the visual characteristics of objects in a class vary with the particular instance and with the shooting conditions. In this paper we propose a visual characterization of object parts, called a "Visual Phrase", that is robust to these variations. A Visual Phrase is a set of regions of interest built according to predefined criteria; this paper studies a topological criterion. We propose an automatic annotation method based on our definition and characterization of Visual Phrases. We report an experiment on the VOC2009 corpus and show that fusing our method with a standard bag-of-visual-words approach applied to full images yields better results than the standard approach alone.