The task of the knowledge-based presentation system WIP is the generation of a variety of multimodal documents from an input consisting of a formal description of the communicative intent of a planned presentation. WIP generates illustrated texts that are customized for the intended audience and situation. We present the architecture of WIP and introduce as its major components the presentation planner, the layout manager, the text generator and the graphics generator. An extended notion of coherence for multimodal documents is introduced that can be used to constrain the presentation planning process. The paper focuses on the coordination of contents planning and layout that is necessary to produce a coherent illustrated text. In particular, we discuss layout revisions after contents planning and the influence of layout constraints on text generation. We show that in WIP the design of a multimodal document is viewed as a non-monotonic planning process that includes various revisions ...