A novel approach to estimating the approximate spatial pose of an object from camera images is proposed. The method is based on a neural network that generates pose hypotheses in real time; these hypotheses can then be refined by registration or tracking systems. A modification of Kohonen’s self-organizing feature map is systematically trained with computer-generated object views such that it responds to a preprocessed image with one or more sets of object orientation parameters. The key concepts proposed are representations of spatial orientation that result in continuous distance measures, and the choice of a fixed network topology that is best suited to the representation of 3-D orientation. Experimental results from both simulated and real images demonstrate that a pose estimate within the accuracy requirements can be found in more than 90% of all cases. The current implementation operates at near frame rate on real-world images.
S. Winkler, Patrick Wunsch, Gerd Hirzinger
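
As an illustration of why a representation with a continuous distance measure matters, the following minimal sketch compares two nearly identical orientations. It assumes unit quaternions as the orientation representation; the abstract does not state which representation is used, so this choice, along with the function names `z_rotation` and `rotation_angle`, is hypothetical. A naive difference of rotation angles jumps by almost 360 degrees at the wrap-around, whereas the quaternion-based distance stays small.

```python
import math

def normalize(q):
    """Scale a quaternion (w, x, y, z) to unit length."""
    n = math.sqrt(sum(c * c for c in q))
    return tuple(c / n for c in q)

def z_rotation(deg):
    """Unit quaternion for a rotation of `deg` degrees about the z axis."""
    half = math.radians(deg) / 2.0
    return (math.cos(half), 0.0, 0.0, math.sin(half))

def rotation_angle(q1, q2):
    """Angle in radians of the rotation that takes orientation q1 to q2.

    Assumption: orientations are given as quaternions (not taken from the paper).
    The absolute value accounts for q and -q describing the same rotation.
    """
    q1, q2 = normalize(q1), normalize(q2)
    dot = abs(sum(a * b for a, b in zip(q1, q2)))
    return 2.0 * math.acos(min(1.0, dot))

# Two orientations that are physically 2 degrees apart.
q_a = z_rotation(179.0)
q_b = z_rotation(-179.0)

naive_difference = abs(179.0 - (-179.0))                  # wrap-around discontinuity
continuous_distance = math.degrees(rotation_angle(q_a, q_b))

print(naive_difference)
print(round(continuous_distance, 6))
```

A distance of this kind varies smoothly as an object rotates, which is the property a self-organizing map needs if neighbouring units are to represent neighbouring orientations.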