When designing computer vision systems for the blind and visually impaired it is important to determine the orientation of the user relative to the scene. We observe that most indoor and outdoor (city) scenes are designed on a Manhattan three-dimensional grid. This Manhattan grid structure puts strong constraints on the intensity gradients in the image. We demonstrate an algorithm for detecting the orientation of the user in such scenes based on Bayesian inference using statistics which we have learnt in this domain. Our algorithm requires a single input image and does not involve pre-processing stages such as edge detection and Hough grouping. We demonstrate strong experimental results on a range of indoor and outdoor images. We also show that estimating the grid structure makes it significantly easier to detect target objects which are not aligned with the grid. Proceedings International Conference on Computer Vision ICCV'99. Corfu, Greece. 1999.
James M. Coughlan, Alan L. Yuille