A method is presented to recover 3D scene structure and camera motion from multiple images without the need for correspondence information. The problem is framed as finding the maximum likelihood structure and motion given only the 2D measurements, integrating over all possible assignments of 3D features to 2D measurements. This goal is achieved by means of an algorithm which iteratively refines a probability distribution over the set of all correspondence assignments. At each iteration a new structure from motion problem is solved, using as input a set of 'virtual measurements' derived from this probability distribution. The distribution needed can be efficiently obtained by Markov Chain Monte Carlo sampling. The approach is cast within the framework of Expectation-Maximization, which guarantees convergence to a local maximizer of the likelihood. The algorithm works well in practice, as will be demonstrated using results on several real image sequences.
Frank Dellaert, Steven M. Seitz, Charles E. Thorpe
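
As a rough illustration of the 'virtual measurements' idea in the abstract, the sketch below computes, for a single image, a probability-weighted average of the 2D measurements for each 3D feature. It is a simplified sketch under stated assumptions, not the method of the paper: the weights here come from an independent Gaussian soft assignment per feature, whereas the abstract describes a distribution over complete correspondence assignments obtained by Markov Chain Monte Carlo sampling. The function names, the isotropic noise parameter `sigma`, and the input layout are all hypothetical.

```python
import math


def soft_assignment(predicted, measurements, sigma=1.0):
    """Return weights[j][k], a soft probability that feature j matches measurement k.

    ASSUMPTION: each feature is treated independently with isotropic Gaussian
    noise, so each row is normalized on its own. This ignores the constraint
    that two features cannot share one measurement.
    """
    weights = []
    for px, py in predicted:
        # Squared distance from the predicted projection to every measurement.
        d2 = [(px - ux) ** 2 + (py - uy) ** 2 for ux, uy in measurements]
        # Subtract the minimum before exponentiating to avoid underflow to 0/0.
        d2_min = min(d2)
        row = [math.exp(-(d - d2_min) / (2.0 * sigma ** 2)) for d in d2]
        total = sum(row)
        weights.append([w / total for w in row])
    return weights


def virtual_measurements(weights, measurements):
    """Return one weighted-average 2D point per feature."""
    virtual = []
    for row in weights:
        vx = sum(w * ux for w, (ux, _) in zip(row, measurements))
        vy = sum(w * uy for w, (_, uy) in zip(row, measurements))
        virtual.append((vx, vy))
    return virtual


# Hypothetical data: predicted projections of two features and two unlabeled measurements.
predicted = [(0.0, 0.0), (10.0, 0.0)]
measurements = [(0.1, 0.0), (9.9, 0.0)]
weights = soft_assignment(predicted, measurements, sigma=0.5)
virtual = virtual_measurements(weights, measurements)
```

In an EM-style loop, points like `virtual` would be passed to a conventional structure from motion solver in place of known correspondences, and the weights would then be recomputed from the updated structure and motion estimate.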