This paper presents a method for registering multimodal imagery in short-range surveillance scenes, where differences in object depth preclude global registration techniques. An analysis of multimodal registration approaches gives insight into the limitations of global assumptions and motivates the developed algorithm. Starting from calibrated stereo imagery, we maximize mutual information in sliding correspondence windows, which inform a disparity voting scheme, to demonstrate successful registration of color and thermal images. Extensive testing on scenes with multiple people at different depths and levels of occlusion shows high rates of successful registration and provides a reliable framework for further processing and analysis of the multimodal imagery.
Stephen J. Krotosky, Mohan M. Trivedi
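As an illustration of the idea summarized in the abstract, the following is a minimal sketch of mutual-information matching in sliding windows followed by per-column disparity voting. It is not the authors' implementation: the function names, the 8-bin intensity quantization, the full-height windows, the rightward search direction, and the simple majority vote are all assumptions made for this example, and images are taken to be rectified, row-major lists of integer intensities.

```python
import math
from collections import Counter


def mutual_information(a, b, bins=8, max_val=256):
    """Mutual information (bits) of two equal-length intensity lists in [0, max_val)."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    qa = [v * bins // max_val for v in a]  # coarse quantization (assumed bin count)
    qb = [v * bins // max_val for v in b]
    n = len(qa)
    joint, pa, pb = Counter(zip(qa, qb)), Counter(qa), Counter(qb)
    # Sum of p(x,y) * log2(p(x,y) / (p(x) p(y))) over the observed pairs.
    return sum((c / n) * math.log2(c * n / (pa[x] * pb[y]))
               for (x, y), c in joint.items())


def window_pixels(img, col, width):
    """Flatten columns [col, col + width) over every row of a row-major image."""
    return [row[c] for row in img for c in range(col, col + width)]


def best_disparity(ref, tgt, col, width, max_disp):
    """Disparity d in [0, max_disp] maximizing MI between the reference window
    at `col` and the target window at `col + d` (assumed search direction)."""
    n_cols = len(ref[0])
    ref_win = window_pixels(ref, col, width)
    best_d, best_mi = 0, -1.0  # MI is non-negative, so -1.0 is a safe floor
    for d in range(max_disp + 1):
        if col + d + width > n_cols:
            break  # shifted window would leave the image
        mi = mutual_information(ref_win, window_pixels(tgt, col + d, width))
        if mi > best_mi:
            best_d, best_mi = d, mi
    return best_d


def vote_disparities(window_disps, width, n_cols):
    """Each window votes its disparity for every column it covers;
    each column keeps the disparity with the most votes."""
    votes = [Counter() for _ in range(n_cols)]
    for col, d in enumerate(window_disps):
        for c in range(col, min(col + width, n_cols)):
            votes[c][d] += 1
    return [v.most_common(1)[0][0] if v else None for v in votes]


def register_columns(ref, tgt, width, max_disp):
    """Slide a window across the reference image and vote per-column disparities."""
    n_cols = len(ref[0])
    disps = [best_disparity(ref, tgt, col, width, max_disp)
             for col in range(n_cols - width + 1)]
    return vote_disparities(disps, width, n_cols)
```

Mutual information is used as the matching score because it depends only on the statistical co-occurrence of intensities, not on their values agreeing, which is what makes it suitable for comparing color and thermal windows. In this sketch the voting step lets overlapping windows disagree, so that columns near a depth boundary take the disparity supported by most of the windows covering them.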