Abstract. Tracking is usually interpreted as finding an object in each of a series of consecutive frames, with regularization done by enforcing temporal smoothness of appearance, shape and motion. We propose a tracker that instead interprets the task of tracking as the segmentation of a 3D (space-time) volume. Temporal and spatial regularization are thereby inherently unified in a single regularization term. Segmentation is performed by a variational approach using anisotropic weighted Total Variation (TV) regularization. The proposed energy is convex and is minimized to global optimality by a fast primal-dual algorithm. Any image feature can be used as the segmentation cue in the proposed Mumford-Shah-like data term. As a proof of concept, we show experiments using a simple color-based appearance model. The experiments demonstrate that our tracking approach is able to handle large variations in shape and size, as well as partial and complete occlusions.
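To make the idea concrete, the following is a minimal sketch of segmenting a video volume with a weighted TV energy and a first-order primal-dual iteration. It is not the implementation evaluated in this work. It assumes a relaxed energy of the form min over u in [0,1] of the sum of g|A grad u| + lam * u * f, where the anisotropy A is reduced to a single temporal scaling w_t, the edge weight g = exp(-beta |grad I|) is an illustrative choice, and the data term f is a two-color squared-distance cue. All function names, parameters and default values (grad3d, div3d, edge_weight, color_data_term, segment_volume, toy_volume, lam, beta, w_t) are hypothetical.

```python
import numpy as np

def grad3d(u, w_t=1.0):
    """Forward differences along (t, y, x); the temporal part is scaled by w_t."""
    g = np.zeros((3,) + u.shape)
    g[0, :-1] = w_t * (u[1:] - u[:-1])
    g[1, :, :-1] = u[:, 1:] - u[:, :-1]
    g[2, :, :, :-1] = u[:, :, 1:] - u[:, :, :-1]
    return g

def div3d(p, w_t=1.0):
    """Negative adjoint of grad3d: <grad3d(u), p> == -<u, div3d(p)>."""
    d = np.zeros(p.shape[1:])
    d[:-1] += w_t * p[0, :-1]
    d[1:] -= w_t * p[0, :-1]
    d[:, :-1] += p[1, :, :-1]
    d[:, 1:] -= p[1, :, :-1]
    d[:, :, :-1] += p[2, :, :, :-1]
    d[:, :, 1:] -= p[2, :, :, :-1]
    return d

def edge_weight(video, beta=2.0):
    """Assumed edge indicator in (0, 1]: small where the color volume changes."""
    mag2 = sum((grad3d(video[..., c]) ** 2).sum(axis=0)
               for c in range(video.shape[-1]))
    return np.exp(-beta * np.sqrt(mag2))

def color_data_term(video, c_fg, c_bg):
    """Negative where a voxel's color is closer to c_fg than to c_bg."""
    return ((video - c_fg) ** 2).sum(-1) - ((video - c_bg) ** 2).sum(-1)

def segment_volume(f, g, lam=5.0, w_t=1.0, n_iter=200):
    """Primal-dual iteration for the relaxed weighted TV segmentation energy."""
    tau = sigma = 0.25  # tau * sigma * 12 < 1 holds for w_t <= 1
    u = np.zeros(f.shape)
    u_bar = u.copy()
    p = np.zeros((3,) + f.shape)
    for _ in range(n_iter):
        # Dual ascent step, then projection onto the pointwise ball |p| <= g.
        p += sigma * grad3d(u_bar, w_t)
        norm = np.sqrt((p ** 2).sum(axis=0))
        p /= np.maximum(1.0, norm / np.maximum(g, 1e-12))
        # Primal descent step, then projection onto the box [0, 1].
        u_old = u
        u = np.clip(u + tau * (div3d(p, w_t) - lam * f), 0.0, 1.0)
        # Over-relaxation of the primal variable.
        u_bar = 2.0 * u - u_old
    return u

def toy_volume(T=6, H=16, W=16):
    """Synthetic clip: a red square drifting to the right on a blue background."""
    truth = np.zeros((T, H, W), dtype=bool)
    for t in range(T):
        truth[t, 4:10, 2 + t:8 + t] = True
    video = np.where(truth[..., None], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0])
    return video, truth

video, truth = toy_volume()
f = color_data_term(video, np.array([1.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0]))
u = segment_volume(f, edge_weight(video), lam=5.0, n_iter=50)
mask = u > 0.5  # binary space-time segmentation; each slice mask[t] is one frame
```

Because the frames are stacked into one array, a single regularizer couples neighboring pixels within a frame and across frames. In this sketch, lowering w_t below 1 weakens the temporal coupling relative to the spatial one.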