The ability to maintain an estimate of the pose of a mobile robot is central to basic navigation and map-building tasks. In this paper we describe a vision-based hybrid localization scheme based on scale-invariant keypoints. In the first stage, topological localization is accomplished by matching the keypoints detected in the current view against a database of model views. In the second stage, once the best-matching model view has been found, the relative pose between the model view and the current view is recovered. We demonstrate the efficiency of the location recognition approach and present a closed-form solution to relative pose recovery for the case of planar motion and unknown camera focal length. The approach is demonstrated on several examples of indoor environments.
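
As an illustration of the first stage, the following is a minimal sketch of how location recognition by keypoint matching might be organized. It is not the implementation used in the paper. It assumes that each view is already represented by an array of keypoint descriptors (one row per keypoint), that matches are accepted with a nearest-neighbour distance-ratio test, and that the model view with the most accepted matches is taken as the recognized location. The function names `count_matches` and `recognize_location` and the threshold `ratio=0.8` are hypothetical choices made for this example.

```python
import numpy as np


def count_matches(query, model, ratio=0.8):
    """Count query descriptors that match a model view.

    A query descriptor is counted when its nearest model descriptor is
    clearly closer than the second nearest (distance-ratio test).
    The ratio threshold is an assumed value, not taken from the paper.
    """
    if len(query) == 0 or len(model) < 2:
        return 0
    # Pairwise squared Euclidean distances, shape (num_query, num_model).
    diff = query[:, None, :] - model[None, :, :]
    d2 = np.sum(diff * diff, axis=2)
    # Smallest and second-smallest squared distance for each query descriptor.
    two_smallest = np.partition(d2, 1, axis=1)[:, :2]
    nearest, second = two_smallest[:, 0], two_smallest[:, 1]
    # Distances are squared, so the ratio threshold is squared as well.
    return int(np.sum(nearest < (ratio ** 2) * second))


def recognize_location(query, model_views, ratio=0.8):
    """Return the index of the best-matching model view and all match counts."""
    scores = [count_matches(query, model, ratio) for model in model_views]
    best = int(np.argmax(scores))
    return best, scores
```

Given the descriptors of the current view and a list of descriptor arrays for the model views, `recognize_location` returns the index of the model view that would then be passed to the relative pose recovery stage. Voting by raw match count is only one possible scoring rule; it is used here because it keeps the sketch short.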