This paper proposes a system to estimate the 3D position and velocity of vehicles, from images acquired with a monocular camera. Given image regions where vehicles are detected, Gaussian distributions are estimated detailing the most probable 3D road regions where vehicles lay. This is done by combining an assumed image formation model with the Unscented Transform mechanism. These distributions are then fed into a Multiple Hypothesis Tracking algorithm, which constructs trajectories coherent with an assumed model of dynamics. This algorithm not only characterizes the dynamics of detected vehicles, but also discards false detections, as they do not find spatio-temporal support. The proposals is tested in synthetic sequences, evaluating how noisy observations and miss-detections affect the accuracy of recovered trajectories.
Daniel Ponsa, Antonio M. López