In this paper we present a novel system for driver-vehicle interaction that combines speech recognition with facial-expression recognition to increase intention-recognition accuracy in the presence of engine and road noise. Our system would allow drivers to interact with in-car devices such as satellite navigation and other telematic or control systems. We describe a pilot study and an experiment in which we tested the system, and show that multimodal fusion of speech and facial-expression recognition provides higher accuracy than either modality alone.

Categories and Subject Descriptors
H5.2 [User Interfaces]

General Terms
Design, Human Factors.

Keywords
Driver monitoring, facial-expression recognition, speech recognition, multimodal inference.