The goal of this paper is to give a short historical overview of multiple-view vision, and in particular of the estimation of both camera geometry and scene models using only images as input. This problem, known as the structure and motion problem, can be seen as the mathematical inverse of the computer graphics problem. The basic structure and motion problems for different feature types are discussed, as well as some recent methods that estimate not only scene geometry but also radiance, irradiance, and illumination properties.