This paper presents an image-processing method for a monocular vision system, used to study semiotic gestures. We propose a robust method for tracking the hands and face of a person communicating through gestures and sign language. A skin model is used to compute the observation density as a skin-colour distribution over the image. Three particle-filter trackers are implemented, with resampling and annealed update steps that increase their robustness to occlusion and to large variations in the acceleration of body parts. Evaluations of the trackers with and without these enhancements show the improvement that they bring.
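
To make the tracking scheme concrete, the following is a minimal sketch of one annealed particle-filter step for a single tracker (one hand or the face). It is an illustration under stated assumptions, not the implementation evaluated in this paper: it assumes a precomputed per-pixel skin-probability map, a random-walk motion model, and systematic resampling. The function names, the annealing exponents `betas`, and the diffusion scale `sigma` are hypothetical choices made for the example.

```python
import numpy as np


def systematic_resample(weights, rng):
    """Return particle indices drawn in proportion to normalised weights."""
    n = len(weights)
    # One random offset, then n evenly spaced positions in [0, 1).
    positions = (rng.random() + np.arange(n)) / n
    cumulative = np.cumsum(weights)
    cumulative[-1] = 1.0  # guard against floating-point round-off
    return np.searchsorted(cumulative, positions)


def annealed_step(particles, skin_prob, rng, betas=(0.25, 0.5, 1.0), sigma=8.0):
    """One tracking step for a single body part (illustrative sketch).

    particles : (N, 2) array of (row, col) position hypotheses
    skin_prob : 2-D array of per-pixel skin probabilities in [0, 1]
    betas     : assumed annealing exponents, from smooth to sharp
    sigma     : assumed diffusion std-dev in pixels, reduced at each layer
    """
    h, w = skin_prob.shape
    for layer, beta in enumerate(betas):
        # Diffuse: random-walk motion model, narrower at each layer.
        noise = rng.normal(0.0, sigma / (layer + 1), particles.shape)
        particles = particles + noise
        particles[:, 0] = np.clip(particles[:, 0], 0, h - 1)
        particles[:, 1] = np.clip(particles[:, 1], 0, w - 1)
        # Weight: observation density read from the skin map, annealed by beta.
        rows = particles[:, 0].astype(int)
        cols = particles[:, 1].astype(int)
        weights = (skin_prob[rows, cols] + 1e-12) ** beta
        weights /= weights.sum()
        # Resample so that particles concentrate on skin-like pixels.
        particles = particles[systematic_resample(weights, rng)]
    # The state estimate is the mean of the resampled particle set.
    return particles, particles.mean(axis=0)
```

In this sketch the small exponents in the early layers flatten the observation density, so particles are not discarded too aggressively when the target moves abruptly or is briefly occluded; the final layer uses the unmodified density. Running three such trackers, one per body part, would additionally require a rule for keeping them on distinct skin regions, which is not shown here.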