Abstract— We propose a system for human-computer interaction via 3D hand movements, based on a combination of visual tracking and a cheap, off-the-shelf accelerometer. We use a 3D model and region-based tracker, which makes the system robust to variations in illumination, motion blur and occlusions. At the same time, the accelerometer allows us to deal with the multimodality of the silhouette-to-pose function. We synchronise the accelerometer and tracker online by casting the calibration problem as a maximum covariance problem, which we then solve probabilistically. We demonstrate the effectiveness of our solution with multiple real-world tests and demonstration scenarios.
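
To give a rough feel for the synchronisation idea, the sketch below estimates the delay between two sampled signals by choosing the lag that maximises their sample covariance. It is only a deterministic, brute-force illustration under stated assumptions, not the probabilistic online solution proposed here: the function names `covariance` and `estimate_lag`, the synthetic `tracker` and `accel` signals, the 7-sample delay and the noise level are all hypothetical.

```python
import math
import random


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n


def estimate_lag(reference, delayed, max_lag):
    """Return the lag (in samples) that maximises the covariance.

    A positive result means `delayed` trails `reference` by that many samples.
    """
    n = len(reference)
    best_lag, best_cov = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        # Align the overlapping parts of the two signals for this lag.
        if lag >= 0:
            a, b = reference[:n - lag], delayed[lag:]
        else:
            a, b = reference[-lag:], delayed[:n + lag]
        c = covariance(a, b)
        if c > best_cov:
            best_lag, best_cov = lag, c
    return best_lag


# Hypothetical stand-ins for the two sensors: a broadband motion signal seen
# by the tracker, and a noisy copy that arrives 7 samples later.
rng = random.Random(0)
tracker = [rng.gauss(0.0, 1.0) for _ in range(400)]
accel = [0.0] * 7 + tracker[:-7]
accel = [v + 0.1 * rng.gauss(0.0, 1.0) for v in accel]

print(estimate_lag(tracker, accel, max_lag=20))
```

On this synthetic data the recovered lag should match the injected 7-sample delay. A real system would additionally have to relate the tracker's pose estimates to the quantities the accelerometer actually measures before comparing them, which this sketch leaves out.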