In this paper we present a vision-based approach to mobile robot localization that integrates an image retrieval system with Monte-Carlo localization. The image retrieval process is based on features that are invariant with respect to image translations, rotations, and limited scale changes. Since it furthermore uses local features, the system is robust against distortions and occlusions, which is especially important in populated environments. By using the sample-based Monte-Carlo localization technique, our robot is able to globally localize itself, to reliably keep track of its position, and to recover from localization failures. Both techniques are combined by extracting, for each image, a set of possible viewpoints using a two-dimensional map of the environment. Our technique has been implemented and tested extensively. We present several experiments demonstrating the reliability and robustness of our approach even in the presence of dynamics in the environment and larger errors in the odometry.
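To make the sample-based update concrete, the following is a minimal sketch of one Monte-Carlo localization cycle, assuming a simple Gaussian odometry motion model and a measurement model that weights each pose by its distance to the viewpoints suggested by image retrieval. The function names (`motion_update`, `retrieval_likelihood`, `mcl_step`) and all noise parameters are illustrative placeholders, not the models used in the paper.

```python
# Sketch of one Monte-Carlo localization (particle filter) step.
# The motion and measurement models are hypothetical simplifications.
import math
import random

def motion_update(pose, odometry, noise=(0.05, 0.05, 0.02)):
    """Propagate a pose (x, y, theta) with noisy odometry (dx, dy, dtheta)."""
    x, y, theta = pose
    dx, dy, dtheta = odometry
    x += dx + random.gauss(0.0, noise[0])
    y += dy + random.gauss(0.0, noise[1])
    theta += dtheta + random.gauss(0.0, noise[2])
    return (x, y, theta % (2.0 * math.pi))

def retrieval_likelihood(pose, candidate_viewpoints, sigma=0.5):
    """Weight a pose by its distance to the viewpoints proposed by the
    image retrieval system (hypothetical measurement model)."""
    best = min(math.hypot(pose[0] - vx, pose[1] - vy)
               for vx, vy in candidate_viewpoints)
    return math.exp(-0.5 * (best / sigma) ** 2)

def mcl_step(particles, odometry, candidate_viewpoints):
    """One predict-weight-resample cycle over the particle set."""
    moved = [motion_update(p, odometry) for p in particles]
    weights = [retrieval_likelihood(p, candidate_viewpoints) for p in moved]
    total = sum(weights) or 1.0
    weights = [w / total for w in weights]
    # Resample with replacement in proportion to the importance weights.
    return random.choices(moved, weights=weights, k=len(moved))

if __name__ == "__main__":
    # Start with particles spread uniformly over a 10 m x 10 m area
    # (global localization), then apply one update.
    particles = [(random.uniform(0, 10), random.uniform(0, 10),
                  random.uniform(0, 2 * math.pi)) for _ in range(500)]
    viewpoints = [(3.0, 4.0), (3.2, 4.1)]  # e.g. derived from retrieved images
    particles = mcl_step(particles, (0.1, 0.0, 0.0), viewpoints)
```

Because the particles are resampled in proportion to how well they agree with the retrieved viewpoints, the same loop supports global localization, position tracking, and recovery from localization failures.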