In this paper we present a multi-sensor fusion system for tracking people with a mobile robot, which integrates the information provided by a laser range sensor and a PTZ camera. We introduce the algorithms used for detecting legs in laser scans and faces in video images, and then describe a human motion model for estimating a person's position, orientation, and height. The ego-motion of the robot is also taken into account, and the sensor information is fused using an implementation of the Unscented Kalman Filter. Finally, multiple human tracks are generated and maintained through a dedicated data association procedure. The results of several experiments are presented, demonstrating the effectiveness of our approach, and some concluding considerations are drawn.
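As a concrete illustration of the fusion step mentioned above, the sketch below shows a generic Unscented Kalman Filter predict/update cycle in Python/NumPy. It is not the paper's implementation: the constant-velocity motion model `f`, the planar position measurement `h` standing in for a laser-based leg detection, and the parameters `Q`, `R`, and `dt` are all illustrative assumptions.

```python
import numpy as np

def sigma_points(x, P, alpha=1e-3, beta=2.0, kappa=0.0):
    """Symmetric scaled sigma points and weights (Wan & van der Merwe)."""
    n = x.size
    lam = alpha**2 * (n + kappa) - n
    S = np.linalg.cholesky((n + lam) * P)       # matrix square root
    pts = np.vstack([x, x + S.T, x - S.T])      # 2n+1 points, one per row
    wm = np.full(2 * n + 1, 1.0 / (2 * (n + lam)))
    wc = wm.copy()
    wm[0] = lam / (n + lam)
    wc[0] = lam / (n + lam) + (1.0 - alpha**2 + beta)
    return pts, wm, wc

def ukf_step(x, P, z, f, h, Q, R):
    """One predict/update cycle for process model f and measurement model h."""
    # Predict: propagate sigma points through the (nonlinear) motion model.
    pts, wm, wc = sigma_points(x, P)
    fx = np.array([f(p) for p in pts])
    x_pred = wm @ fx
    P_pred = Q + sum(w * np.outer(d, d) for w, d in zip(wc, fx - x_pred))
    # Update: propagate predicted sigma points through the measurement model.
    pts, wm, wc = sigma_points(x_pred, P_pred)
    hz = np.array([h(p) for p in pts])
    z_pred = wm @ hz
    S = R + sum(w * np.outer(d, d) for w, d in zip(wc, hz - z_pred))
    Pxz = sum(w * np.outer(dx, dz)
              for w, dx, dz in zip(wc, pts - x_pred, hz - z_pred))
    K = Pxz @ np.linalg.inv(S)                  # Kalman gain
    return x_pred + K @ (z - z_pred), P_pred - K @ S @ K.T

# Hypothetical example: constant-velocity person state (px, py, vx, vy)
# updated with a 2D position measurement, e.g. from a leg detector.
dt = 0.1
f = lambda x: np.array([x[0] + dt * x[2], x[1] + dt * x[3], x[2], x[3]])
h = lambda x: x[:2]
x, P = np.zeros(4), np.eye(4)
Q, R = 0.01 * np.eye(4), 0.05 * np.eye(2)
x, P = ukf_step(x, P, np.array([0.3, 0.1]), f, h, Q, R)
```

In the same spirit, a camera-based face detection could be fused by calling `ukf_step` again with a second measurement model covering the image-derived quantities; the sigma-point machinery is unchanged.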