Compared with the traditional interaction approaches, such as keyboard, mouse, pen, etc, vision based hand interaction is more natural and efficient. In this paper, we proposed a robust real-time hand gesture recognition method. In our method, firstly, a specific gesture is required to trigger the hand detection followed by tracking; then hand is segmented using motion and color cues; finally, in order to break the limitation of aspect ratio encountered in most of learning based hand gesture methods, the scale-space feature detection is integrated into gesture recognition. Applying the proposed method to navigation of image browsing, experimental results show that our method achieves satisfactory performance.