This paper describes an XPath-based discourse analysis module for Spoken Dialogue Systems that lets the dialogue author manipulate and query both the semantic representation of the user input and the dialogue context using a simple and compact formalism. We show that, in managing human-machine interaction, the discourse context and the dialogue history can be effectively represented as Document Object Model (DOM) structures. DOM defines interfaces that dialogue scripts can use to dynamically access and update the content, structure, and style of documents. More generally, the approach also applies to richer multimedia and multimodal interactions, where the interpretation of the user input depends on a combination of input modalities.

Categories & Subject Descriptors: I.2.1 [Applications and Expert Systems]: Natural Language Interfaces; I.2.7 [Natural Language Processing]: Speech Recognition and Synthesis

General Terms: Design, Experimentation.
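
To make the idea of querying a tree-structured dialogue context with path expressions concrete, the following is a minimal sketch and not the module described in this paper. It assumes a hypothetical XML layout (the `dialogue`, `history`, `turn`, `semantics`, `context`, and `city` names are invented for illustration) and uses Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath and is not a full DOM implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical dialogue state: all element and attribute names are
# illustrative assumptions, not the schema used by the paper.
CONTEXT_XML = """
<dialogue>
  <history>
    <turn id="1" speaker="system">
      <prompt>Where would you like to fly?</prompt>
    </turn>
    <turn id="2" speaker="user">
      <semantics><city role="destination">Boston</city></semantics>
    </turn>
  </history>
  <context/>
</dialogue>
"""


def latest_user_value(root, concept):
    """Return the most recent user-supplied value for a concept, or None."""
    # findall returns matches in document order, so the last match
    # comes from the latest user turn.
    path = f"./history/turn[@speaker='user']/semantics/{concept}"
    matches = root.findall(path)
    return matches[-1].text if matches else None


def update_context(root, concept, value):
    """Store a concept value under <context>, overwriting any earlier one."""
    context = root.find("./context")
    slot = context.find(concept)
    if slot is None:
        slot = ET.SubElement(context, concept)
    slot.text = value


root = ET.fromstring(CONTEXT_XML)
destination = latest_user_value(root, "city")
update_context(root, "city", destination)
```

Here a single path expression retrieves the user's latest semantic value from the dialogue history, and the same tree is then updated in place, mirroring the read and update access that the abstract attributes to dialogue scripts.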