Over the last decade, consumer graphics cards have grown steadily more powerful, driven by the computer games industry. These cards are now programmable and capable of processing huge amounts of data in a SIMD fashion. In this work, we propose an alternative implementation of a very intuitive and well-known 2D template matching algorithm, in which the most computationally expensive task is carried out by the graphics hardware processor. This computational approach is not new, but here we summarize the method step by step to better understand its underlying complexity. Experimental results show an extraordinary performance gain, even on obsolete hardware.
Raúl Cabido, Antonio S. Montemayor, Á