We present a novel representation and method for detecting and explaining anomalous activities in a video stream. Drawing from natural language processing, we introduce a representation of activities as bags of event n-grams, in which we analyze the global structural information of activities using their local event statistics. We demonstrate how maximal cliques in an undirected edge-weighted graph of activities can be used, in an unsupervised manner, to discover regular sub-classes of an activity class. Based on these discovered sub-classes, we formulate a definition of anomalous activities and present a way to detect them. Finally, we characterize each discovered sub-class in terms of its "most representative member," and present an information-theoretic method to explain the detected anomalies in a human-interpretable form.
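As an illustration of the bag-of-event-n-grams idea, the following is a minimal sketch, assuming an activity is given as an ordered list of discrete event labels. The function name `bag_of_event_ngrams`, the event labels, and the choice of n = 2 are hypothetical and are not taken from the paper.

```python
from collections import Counter


def bag_of_event_ngrams(events, n=3):
    """Count every contiguous run of n events in one activity.

    `events` is an ordered list of event labels (hypothetical names).
    The result discards where each n-gram occurred and keeps only how
    often it occurred, i.e. the local event statistics of the activity.
    """
    return Counter(tuple(events[i:i + n]) for i in range(len(events) - n + 1))


# Hypothetical activity: a sequence of detected events.
activity = ["enter", "pick", "scan", "pick", "scan", "exit"]
bag = bag_of_event_ngrams(activity, n=2)
```

Here `bag` counts the bigram `("pick", "scan")` twice and every other bigram once; an activity shorter than n yields an empty bag.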
Raffay Hamid, Amos Y. Johnson, Samir Batta, Aaron