In this paper, we wish to build a high-quality database of
images depicting scenes, along with their real-world three-dimensional
(3D) coordinates. Such a database is useful
for a variety of applications, including training systems for
object detection and validation of 3D output. We build such
a database from images that have been annotated with only
the identity of objects and their spatial extent in images. Important
for this task is the recovery of geometric information
that is implicit in the object labels, such as qualitative relationships
between objects (attachment, support, occlusion)
and quantitative ones (camera parameters). We
describe a model that integrates cues extracted from the object
labels to infer the implicit geometric information. We
show that we are able to obtain high-quality 3D information
by evaluating the proposed approach on a database
obtained with a laser range scanner. Finally, given the
database of 3D scenes, we show how it can ...
Antonio B. Torralba, Bryan C. Russell