A significant problem in scene interpretation is the efficient bottom-up extraction and representation of salient features. In this paper, we address the problem of correlating salient motion at a spatio-temporal level and also across spatially separated regions, since it is in the interactions that more sophisticated scene interpretation can be found. We show that it is possible to spatio-temporally locate and detect salient motion events and interactions in two contrasting scenarios using the same hierarchical co-occurrence framework, thus generating a concise description of a dynamic scene from the sequence data alone. Results show it is possible to reduce a highly populated multi-dimensional co-occurrence matrix, representing correlations between salient motion regions, to a one-dimensional vector with clearly separable unusual activity. The results also show that the method inherently provides a quantifiable measure of the saliency of an interaction through its frequency of occurren...
Hayley S. Hung, Shaogang Gong