Abstract. We describe a visual communication application for a dark, theater-like interactive virtual simulation training environment. Our system visually estimates and tracks the trainee's body position, orientation, and arm-pointing direction. It uses a near-IR camera array to capture images of the trainee from different angles in the dimly lit theater. Image features such as silhouettes and intermediate silhouette body-axis points are then segmented and extracted from the image backgrounds. 3D body shape information, such as 3D body skeleton points and visual hulls, can be reconstructed from these 2D features in multiple calibrated images. We propose a particle-filtering-based method that fits an articulated body model to the observed image features. Currently we focus on the arm-pointing gesture of either arm. From the fitted articulated model we can derive the position on the screen to which the user is pointing. We use current graphics hardware to accelerate the processing...
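
The last geometric step, deriving the on-screen position from the fitted arm, can be illustrated with a ray-plane intersection. The following is a minimal sketch under assumptions not stated in the abstract: the pointing ray is taken to run from the shoulder joint through the hand joint of the fitted model, the screen is treated as a flat plane given by a point and a normal, and the function name `pointing_target` and its parameters are hypothetical.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def _sub(a: Vec3, b: Vec3) -> Vec3:
    """Component-wise difference a - b."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec3, b: Vec3) -> float:
    """Dot product of two 3D vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def pointing_target(
    shoulder: Vec3,
    hand: Vec3,
    plane_point: Vec3,
    plane_normal: Vec3,
    eps: float = 1e-9,
) -> Optional[Vec3]:
    """Intersect the shoulder-to-hand ray with the screen plane.

    Assumption: the pointing direction is the shoulder-to-hand vector.
    Returns the 3D hit point, or None if the arm is parallel to the
    screen or points away from it.
    """
    direction = _sub(hand, shoulder)
    denom = _dot(plane_normal, direction)
    if abs(denom) < eps:
        return None  # ray is (nearly) parallel to the screen plane
    # Ray parameter t such that shoulder + t * direction lies on the plane.
    t = _dot(plane_normal, _sub(plane_point, shoulder)) / denom
    if t <= 0:
        return None  # the screen is behind the pointing direction
    return (
        shoulder[0] + t * direction[0],
        shoulder[1] + t * direction[1],
        shoulder[2] + t * direction[2],
    )


if __name__ == "__main__":
    # Hypothetical setup: screen is the plane z = 4, facing the trainee.
    hit = pointing_target(
        shoulder=(0.0, 1.5, 0.0),
        hand=(0.25, 1.5, 0.5),
        plane_point=(0.0, 0.0, 4.0),
        plane_normal=(0.0, 0.0, 1.0),
    )
    print(hit)
```

In a tracking loop, the same function would be evaluated on the joint positions of the current best-fitting model; mapping the 3D hit point to screen pixel coordinates would additionally require the screen's calibrated extent, which is not given here.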