Abstract. Automatically modeling appropriate and valid document descriptions is central to the usefulness and success of an ontology-based personal document management system. One of the more practical problems is deducing knowledge from information sources (metadata, attributes, features, etc.) that are sometimes large but also varying, ambiguous, or domain-specific. The generation process, which requires transformation and reasoning techniques, depends primarily on the application context and should be customized accordingly. Furthermore, automatically generated and deduced information needs appropriate cleaning and consolidation to maintain a certain level of data quality. Therefore, this paper presents a stepwise knowledge modeling approach based on consecutive stages and separate, configurable rule sets. Following the principle of divide-and-conquer, the suggested approach separately addresses the problems of general translation of diverse information sources, syntax check, normalizat...