Abstract. Learning event models from videos has applications ranging from abnormal event detection to content-based video retrieval. Relational learning techniques such as Inductive Logic Programming (ILP) hold promise for building such models, but have not been successfully applied to the very large datasets that result from video data. In this paper we present a novel supervised learning framework to learn event models from large video datasets (2.5 million frames) using ILP. Efficiency is achieved by using the learning-from-interpretations setting and a typing system. This allows learning to take place in a reasonable time frame with reduced false positives. Experimental results on video data from an airport apron, where events such as Loading, Unloading and Jet-Bridge Parking are learned, suggest that the techniques are suitable for real-world scenarios.
Krishna S. R. Dubba, Anthony G. Cohn, David C. Hog