Global motion estimation is an important task for various video processing techniques. The estimation has to be robust in the presence of arbitrarily moving foreground objects. Two different kinds of estimation methods exist for this task. On the one hand, pixel-based approaches deliver more precise results and work more robustly on video sequences with foreground objects. On the other hand, when working on encoded video streams, block-based methods can be used for a much faster but often less precise estimation. We propose a two-step estimation method based on the detection and tracking of feature points in video frames, followed by robust motion model estimation using the Helmholtz principle. First, features that are well suited to tracking are detected and tracked through the video sequence. Subsequently, a perspective motion model is derived from the resulting correspondences by removing feature pairs that do not belong to the global motion.
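
For orientation, a perspective motion model is commonly written in the eight-parameter form sketched below. The parameter names m_0, ..., m_7 and the normalization of the last denominator coefficient to 1 are assumptions made for illustration; the text above does not state which parameterization is used. Here (x, y) is a feature position in one frame and (x', y') is the position of the corresponding feature in the next frame.

```latex
\begin{align}
  x' &= \frac{m_0\,x + m_1\,y + m_2}{m_6\,x + m_7\,y + 1}, \\
  y' &= \frac{m_3\,x + m_4\,y + m_5}{m_6\,x + m_7\,y + 1}.
\end{align}
```

In this form each feature correspondence yields two equations, so at least four correspondences in general position are needed to determine the eight parameters. Feature pairs on independently moving foreground objects do not satisfy the model and have to be excluded before the final fit.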