We present a system that tracks a human head in 3D in real time. The head is modeled as an ellipse with a color histogram trained on skin and hair samples. The histogram is updated dynamically from incoming image data to accommodate varying illumination conditions. The size of the ellipse searched for in the image is scaled according to depth information obtained from stereo vision. The strength of our method lies in the use of a predictive filter to fuse color and depth information, iteratively refining the 3D location of the head and the parameters of the head color histogram.
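
The two adaptation steps described above, scaling the ellipse with depth and updating the color histogram, can be illustrated with a minimal sketch. It is not the system's actual implementation. It assumes a pinhole camera model for the depth scaling and a simple exponential blend for the histogram update, neither of which is specified here. The function names, the focal length, the head size, and the blending weight `alpha` are all hypothetical.

```python
def projected_axis_px(axis_m, depth_m, focal_px):
    """Image-plane length (pixels) of an ellipse axis of physical length
    axis_m (meters) seen at depth depth_m (meters).

    Assumption: ideal pinhole camera with focal length focal_px in pixels.
    """
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return focal_px * axis_m / depth_m


def blend_histogram(model, observed, alpha=0.1):
    """Blend the stored color histogram with the newly observed one.

    Assumption: exponential forgetting with weight alpha in [0, 1];
    the result is renormalized so that its bins sum to 1.
    """
    if len(model) != len(observed):
        raise ValueError("histograms must have the same number of bins")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    mixed = [(1.0 - alpha) * m + alpha * o for m, o in zip(model, observed)]
    total = sum(mixed)
    if total == 0:
        raise ValueError("blended histogram is empty")
    return [v / total for v in mixed]


# Hypothetical values: a 0.2 m head axis, a 500 px focal length.
near = projected_axis_px(0.2, depth_m=1.0, focal_px=500.0)
far = projected_axis_px(0.2, depth_m=2.0, focal_px=500.0)

# Two-bin toy histograms standing in for the trained and observed colors.
updated = blend_histogram([1.0, 0.0], [0.0, 1.0], alpha=0.25)
```

Under these assumptions, doubling the depth halves the projected axis, and a small `alpha` lets the color model follow gradual illumination changes without being overwritten by a single frame.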