Local feature methods suitable for image-feature-based object recognition and for the estimation of motion and structure consist of two steps: the 'where' step and the 'what' step. The 'where' step (e.g., an interest point detector) must select image points that can be robustly localized under common image deformations and whose neighborhoods are relatively informative. The 'what' step (e.g., a local feature extractor) then provides a representation of the image neighborhood that is semi-invariant to image deformations, yet distinctive enough to support model identification. We present a quantitative evaluation of both the 'where' and the 'what' steps for three recent local feature methods: a) phase-based local features [2], b) differential invariants [14], and c) the scale invariant feature transform (SIFT) [9]. Moreover, in order to make the phase-based approach more comparable to the other two approaches, we also introduce a new form of multi-scale inter...
Gustavo Carneiro, Allan D. Jepson