This work presents the application of INTHELEX, an incremental first-order logic learning system, to learning rules for the automatic identification of a wide range of significant document classes and their related components. Specifically, the material consists of multi-format cultural heritage documents concerning European films from the 1920s and 1930s, provided by the EU project COLLATE. Incrementality plays a key role when the set of documents is continuously augmented. To ensure that incrementality causes no performance loss with respect to classical one-step systems, a comparison with Progol was carried out. Experimental results show that the proposed approach is a viable solution, in terms of both its performance and its effectiveness in the document processing domain.