Inspired by the expectation-based perception of humans, a surprise-driven active vision system is proposed. The system considers not only the spatial saliency of objects in the environment but also temporal novelty in the neighborhood. Surprise is defined as the difference between the saliency probability distributions of two consecutive input images, measured using the Kullback-Leibler divergence. The high-speed gaze-shift capability of the camera platform, together with GPU-based parallel computation, enables real-time tracking of surprising events.
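
The surprise measure described above can be illustrated with a minimal sketch. It assumes each saliency map is a small 2D grid of non-negative values that is normalized into a probability distribution over pixels, and that surprise is the divergence of the current frame's distribution from the previous frame's. The direction of the divergence, the whole-image scope, the `eps` smoothing term, and the names `to_distribution` and `surprise` are illustrative assumptions, not details taken from the system itself.

```python
import math


def to_distribution(saliency, eps=1e-12):
    """Flatten a 2D saliency map and normalize it to sum to 1.

    eps is an assumed smoothing term that keeps every probability
    strictly positive so the logarithm below is always defined.
    """
    flat = [max(float(v), 0.0) + eps for row in saliency for v in row]
    total = sum(flat)
    return [v / total for v in flat]


def surprise(prev_saliency, curr_saliency):
    """Kullback-Leibler divergence D(P_curr || P_prev), in bits.

    The direction (current relative to previous) is an assumption.
    """
    q = to_distribution(prev_saliency)
    p = to_distribution(curr_saliency)
    if len(p) != len(q):
        raise ValueError("saliency maps must have the same number of pixels")
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q))


# Hypothetical 2x2 saliency maps for two consecutive frames:
# the salient region moves from the bottom-right to the top-left pixel.
prev_map = [[0.1, 0.1], [0.1, 0.7]]
curr_map = [[0.7, 0.1], [0.1, 0.1]]

print(surprise(prev_map, prev_map))  # unchanged scene: essentially zero
print(surprise(prev_map, curr_map))  # shifted saliency: clearly positive
```

With this formulation, an unchanged scene yields a surprise of essentially zero, while a shift in where saliency is concentrated yields a positive value that grows with the size of the change. A real-time system would likely evaluate such a measure in parallel over local regions rather than once per image.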