In order to realize their full potential, multimodal systems need to support not just input from multiple modes, but also synchronized integration of modes. Johnston et al. (1997) model this integration using a unification operation over typed feature structures. This is an effective solution for a broad class of systems, but limits multimodal utterances to combinations of a single spoken phrase with a single gesture. We show how the unification-based approach can be scaled up to provide a full multimodal grammar formalism. In conjunction with a multidimensional chart parser, this approach supports integration of multiple elements distributed across the spatial, temporal, and acoustic dimensions of multimodal interaction. Integration strategies are stated in a high-level unification-based rule formalism supporting rapid prototyping and iterative development of multimodal systems.
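As a minimal sketch of the core operation referred to above, the following Python fragment unifies two feature structures represented as nested dictionaries. The representation, function name, and the speech/gesture example values are illustrative assumptions, not taken from the paper or its grammar formalism; typing constraints and temporal constraints are omitted.

```python
def unify(fs1, fs2):
    """Unify two feature structures; return the merged structure, or None on a clash."""
    if not isinstance(fs1, dict) or not isinstance(fs2, dict):
        # Atomic values unify only if they are equal.
        return fs1 if fs1 == fs2 else None
    result = dict(fs1)
    for feature, value in fs2.items():
        if feature in result:
            merged = unify(result[feature], value)
            if merged is None:
                return None          # conflicting values: unification fails
            result[feature] = merged
        else:
            result[feature] = value  # feature contributed by only one structure
    return result

# Hypothetical example: combining a spoken command with a deictic gesture.
speech = {"cat": "command", "action": "move", "object": {"type": "unit"}}
gesture = {"object": {"type": "unit", "id": "unit_42"}, "location": (37.4, -122.1)}

print(unify(speech, gesture))
# -> {'cat': 'command', 'action': 'move',
#     'object': {'type': 'unit', 'id': 'unit_42'}, 'location': (37.4, -122.1)}
```

In this sketch, the spoken phrase and the gesture each contribute a partial description of the intended command, and unification succeeds only when their shared features (here, the object's type) are compatible.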