Abstract-- In this paper, we present a clustering-based tracking algorithm for tracking people (e.g. hand, head, eyeball, body, and lips). It is always a challenging task to track people under complex environment, because such target often appears as a concave object or having apertures. In this case, many background areas are mixed into the tracking area which are difficult to be removed by modifying the shape of the search area during tracking. Our method becomes a robust tracking algorithm by applying the following four key ideas simultaneously: 1) Using a 5D feature vector to describe both the geometric feature "(x,y)" and color feature "(Y,U,V)" of each pixel uniformly. This description ensures our method to follow both the position and color changes simultaneously during tracking; 2) This algorithm realizes the robust tracking for objects with apertures by classifying the pixels, within the search area, into "target" and "background" with K...