In this paper we consider the problem of recovering the free space of an indoor scene from a single image. We show that exploiting the box-like geometric structure of furniture, together with constraints provided by the scene, allows us to recover the extent of major furniture objects in 3D. Our “boxy” detector localizes box-shaped objects oriented parallel to the scene across different scales and object types, and thus blocks out the occupied space in the scene. To localize the objects more accurately in 3D, we introduce a set of specially designed features that capture the floor contact points of the objects. Because image-based metrics are not very indicative of performance in 3D, we make the first attempt to evaluate single-view occupancy estimates in terms of 3D error, and we propose several task-driven performance measures for this purpose. On our dataset of 592 indoor images marked with full 3D geometry of the scene, we show that: (a) our detector works well under image-based metrics; (b) our refine...
Varsha Hedau, Derek Hoiem, David A. Forsyth