In this paper, we develop a system that tracks and recognizes hand motion in near real time. An important application of the system is to simulate a mouse as a visual input device. The tracking approach is based on the Condensation algorithm and an active shape model. Our contribution is the combination of multi-modal templates to improve tracking performance: a weighting value is assigned to the sampling ratio of Condensation according to the prior property of the templates. The recognition approach is based on a hidden Markov model (HMM). Experiments show that the system is promising as an auxiliary input device.
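To make the idea of weighting the sampling ratio concrete, the following is a minimal sketch of one Condensation-style cycle (resample, predict, measure) in which each sample is tied to a template and the resampling step is biased by a per-template prior. It is an illustration under stated assumptions, not the system described here: the function name `condensation_step`, the `template_prior` list, the random-walk dynamics, and the user-supplied `likelihood` callable are all hypothetical, and the active shape model and the HMM recognizer are not modeled.

```python
import math
import random


def condensation_step(particles, weights, template_ids, template_prior,
                      likelihood, rng, noise_std=1.0):
    """One resample-predict-measure cycle of a Condensation-style filter.

    particles:      list of N state vectors (lists of floats)
    weights:        list of N non-negative weights from the previous step
    template_ids:   list of N template indices, one per particle
    template_prior: list of K prior weights, one per template (assumed input)
    likelihood:     callable(state, template_id) -> positive score
    rng:            random.Random instance
    """
    n = len(particles)

    # Resample: scale each particle's weight by the prior of its template,
    # so templates with a higher prior receive a larger share of the samples.
    biased = [w * template_prior[t] for w, t in zip(weights, template_ids)]
    chosen = rng.choices(range(n), weights=biased, k=n)

    new_particles, new_ids, new_weights = [], [], []
    for i in chosen:
        # Predict: random-walk dynamics (an assumption made for brevity).
        state = [x + rng.gauss(0.0, noise_std) for x in particles[i]]
        template = template_ids[i]
        new_particles.append(state)
        new_ids.append(template)
        # Measure: score the predicted state against the observation model.
        new_weights.append(likelihood(state, template))

    # Normalize so the new weights sum to one.
    total = sum(new_weights)
    new_weights = [w / total for w in new_weights]
    return new_particles, new_weights, new_ids
```

In this sketch the prior only changes how many samples each template receives; whether and how that bias should be compensated in the measurement weights is a design choice that the sketch leaves open.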