In this article, we show how an active stereo camera head can autonomously learn to fixate objects in space. During fixation, the system performs an initial saccade followed by a correction saccade. In the learning phase, the correction saccade is controlled by a crude prewired algorithm, in analogy to a mechanism surmised to exist in the brainstem. A vector-based neural network serves as the adaptive component of the system. A self-organizing fovea dramatically improves both the convergence of the learning algorithm and the accuracy of fixation. As a possible application, we describe the visuo-motor coordination of the camera head with an anthropomorphic robot arm.
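
The two-saccade learning scheme summarized above can be illustrated with a deliberately simplified one-dimensional sketch. It is not the implementation described in this article: a single scalar gain `w` stands in for the vector-based neural network, the self-organizing fovea is omitted, and the plant constant `K_TRUE`, the prewired gain `K_CRUDE`, and all function names are hypothetical values chosen for illustration. The sketch only shows why a crude correction can serve as a teaching signal: as long as the correction points in the right direction, the learned initial saccade drifts toward the value at which no correction is needed.

```python
import random

K_TRUE = 0.05   # hypothetical plant: motor units needed per pixel of retinal error
K_CRUDE = 0.03  # hypothetical prewired gain, deliberately wrong (too small)


def residual_error(retinal_error, motor_command):
    """Retinal error (pixels) left after executing a motor command on the toy plant."""
    return retinal_error - motor_command / K_TRUE


def train(n_trials=200, learning_rate=0.5, seed=0):
    """Learn the gain of the initial saccade from the prewired corrections."""
    rng = random.Random(seed)
    w = 0.0  # learned gain; starts with no knowledge of the plant
    for _ in range(n_trials):
        # Target appears at a random, nonzero retinal offset (pixels).
        e = rng.choice([-1, 1]) * rng.uniform(10.0, 100.0)
        m1 = w * e                  # initial saccade from the learned map
        e2 = residual_error(e, m1)  # error seen after the initial saccade
        m2 = K_CRUDE * e2           # crude prewired correction saccade
        # The correction acts as the teaching signal: nudge w so that next
        # time the initial saccade alone would also cover the correction.
        w += learning_rate * m2 / e
    return w


def fixation_error(w, retinal_error):
    """Residual error (pixels) after a single learned saccade."""
    return abs(residual_error(retinal_error, w * retinal_error))
```

In this toy setting, calling `train()` drives `w` toward `K_TRUE` even though the prewired gain is wrong, because the update vanishes only when the initial saccade leaves no residual error. A real camera head would need a nonlinear, multi-dimensional map from stereo image coordinates to joint angles, which is the role the adaptive network plays in the system described here.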