Abstract. One major goal of human-computer interfaces is to simplify the communication task. Traditionally, users have been restricted to the language of the computer for this task. With the emergence of graphical and multimodal interfaces, the effort required to work with a computer is decreasing. However, the communication problem persists, and users must still attend to the communication task itself when they deal with a computer. Our work focuses on improving communication between the human and the computer. This paper presents the foundations of a multimodal dialog model based on a modal logic, which integrates speech and action within a single framework.