Many different approaches to content-based image retrieval have been proposed in the literature. Successful approaches consider not only simple features such as color, but also take the structural relationships between objects into account. In this paper we describe two models for image representation that integrate structural features and content features in a tree or a graph structure. The effectiveness of these two approaches is evaluated on real-world data, using clustering as a means of evaluation. Furthermore, we show that combining the two models can further enhance retrieval accuracy.