In this paper, we propose combining the Self-Organizing Map (SOM) with the tangent distance for effective clustering in Document Image Analysis. The proposed model (SOM TD) is used for character and layout clustering, with applications to word retrieval and page classification. Using the tangent distance makes SOM clustering more tolerant of small local transformations of the input patterns.
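To make the role of the tangent distance concrete, the following is a minimal sketch and not the implementation used in this paper. It assumes a one-sided tangent distance, 1-D patterns, and a single tangent vector that approximates a small shift by finite differences. The function names (`translation_tangent`, `one_sided_tangent_distance`, `best_matching_unit`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def translation_tangent(pattern):
    """Finite-difference tangent vector for a small shift of a 1-D pattern (assumed transformation)."""
    return np.gradient(pattern)


def one_sided_tangent_distance(x, prototype, tangents):
    """Squared distance from x to the affine subspace prototype + span(tangents)."""
    diff = x - prototype
    # Columns of T are the tangent vectors at the prototype, shape (d, k).
    T = np.atleast_2d(tangents).T
    # Least-squares coefficients of the best small transformation of the prototype.
    coeffs, *_ = np.linalg.lstsq(T, diff, rcond=None)
    residual = diff - T @ coeffs
    return float(residual @ residual)


def best_matching_unit(x, prototypes):
    """Index of the prototype closest to x under the tangent distance (SOM winner search)."""
    dists = [
        one_sided_tangent_distance(x, p, translation_tangent(p))
        for p in prototypes
    ]
    return int(np.argmin(dists))


# Illustrative data: a bump and a slightly shifted copy of it.
grid = np.linspace(-3.0, 3.0, 61)
prototype = np.exp(-grid ** 2)
shifted = np.exp(-(grid - 0.1) ** 2)

td = one_sided_tangent_distance(shifted, prototype, translation_tangent(prototype))
euclid = float(np.sum((shifted - prototype) ** 2))
```

In this sketch the tangent distance can never exceed the squared Euclidean distance, because a zero coefficient vector reproduces the Euclidean case. For the shifted bump it is much smaller, which is the kind of tolerance to small local transformations described above. In a SOM, a function like `best_matching_unit` would replace the usual Euclidean winner search.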