We present a human body motion tracking system for an interactive virtual simulation training environment. The system captures images using IR illumination and near-IR cameras to overcome the limitations of a dimly lit environment. Because the resulting images lack much internal texture, we extract features such as silhouettes and the medial axes of blobs. We combine 2D iterative closest point (ICP) matching with particle filtering to estimate the articulated body configuration of a trainee from these image features. The method allows articulation of the arms at the elbows and shoulders and of the body at the waist, which is a considerable improvement over previous such methods. Our system works in real time and is robust to temporary errors in image acquisition or tracking. It serves as part of a multi-modal user-input device for interactive simulation.
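
To make the particle filtering idea concrete, the following is a minimal sketch of one filter update for a planar two-joint arm (shoulder and elbow angles). It is not the system's implementation: the segment lengths, noise levels, the use of 2D elbow and wrist positions as the measurement, and all function names are assumptions chosen for illustration, and the ICP stage is omitted entirely.

```python
import math
import random

# Hypothetical segment lengths in metres; not taken from the paper.
UPPER_ARM = 0.30
FOREARM = 0.25


def forward_kinematics(shoulder, elbow):
    """Return ((elbow_x, elbow_y), (wrist_x, wrist_y)) for a planar arm rooted at the origin."""
    ex = UPPER_ARM * math.cos(shoulder)
    ey = UPPER_ARM * math.sin(shoulder)
    wx = ex + FOREARM * math.cos(shoulder + elbow)
    wy = ey + FOREARM * math.sin(shoulder + elbow)
    return (ex, ey), (wx, wy)


def likelihood(pose, observed, sigma=0.02):
    """Gaussian likelihood of the observed joint positions given a pose hypothesis."""
    predicted = forward_kinematics(*pose)
    d2 = sum((px - ox) ** 2 + (py - oy) ** 2
             for (px, py), (ox, oy) in zip(predicted, observed))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def systematic_resample(particles, weights, rng):
    """Draw len(particles) samples in proportion to the normalized weights."""
    n = len(particles)
    step = 1.0 / n
    u = rng.random() * step
    cumulative = weights[0]
    i = 0
    resampled = []
    for _ in range(n):
        # Advance to the particle whose cumulative weight covers u.
        while u > cumulative and i < n - 1:
            i += 1
            cumulative += weights[i]
        resampled.append(particles[i])
        u += step
    return resampled


def particle_filter_step(particles, observed, rng, motion_sigma=0.05):
    """One predict / weight / resample cycle; returns (new_particles, estimate)."""
    # Predict: diffuse each (shoulder, elbow) hypothesis with Gaussian noise.
    predicted = [(s + rng.gauss(0.0, motion_sigma), e + rng.gauss(0.0, motion_sigma))
                 for s, e in particles]
    # Weight: score each hypothesis against the measured joint positions.
    weights = [likelihood(p, observed) for p in predicted]
    total = sum(weights)
    if total == 0.0:
        # Assumed fallback for a bad frame: ignore the measurement, keep the prediction.
        weights = [1.0 / len(predicted)] * len(predicted)
    else:
        weights = [w / total for w in weights]
    # Estimate: weighted mean of the joint angles (adequate while the spread is small).
    estimate = (sum(w * p[0] for w, p in zip(weights, predicted)),
                sum(w * p[1] for w, p in zip(weights, predicted)))
    return systematic_resample(predicted, weights, rng), estimate
```

Calling `particle_filter_step` once per frame with the latest measurement keeps a population of pose hypotheses alive. The uniform-weight fallback is one simple way, assumed here, to ride out a frame whose measurement matches no hypothesis, in the spirit of the robustness to temporary errors described above.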