We present a patch-based algorithm for the purpose of object classification in video surveillance. Within detected regions-of-interest (ROIs) of moving objects in the scene, a feature vector is calculated based on template matching of a large set of image patches. Instead of matching direct image pixels, we use Gabor-filtered versions of the input image at several scales. This approach has been adopted from recent experiments in generic object-recognition tasks. We present results for a new typical video surveillance dataset containing over 9,000 object images. Furthermore, we compare our system performance with another existing smaller surveillance dataset. We have found that with 50 training samples or higher, our detection rate is on the average above 95%. Because of the inherent scalability of the algorithm, an embedded system implementation is well within reach.
Rob G. J. Wijnhoven, Peter H. N. de With