This paper addresses the problem of fully automated
mining of public-space video data. A novel Markov Clustering
Topic Model (MCTM) is introduced which builds on
existing Dynamic Bayesian Network models (e.g. HMMs)
and Bayesian topic models (e.g. Latent Dirichlet Allocation),
and overcomes their drawbacks in accuracy, robustness
and computational efficiency. Specifically, our model
profiles complex dynamic scenes by robustly clustering visual
events into activities and these activities into global
behaviours, and correlates behaviours over time. A collapsed
Gibbs sampler is derived for offline learning with
unlabeled training data and, significantly, a new approximation
to online Bayesian inference is formulated to enable
dynamic scene understanding and behaviour mining in new
video data online in real time. The strength of this model
is demonstrated by unsupervised learning of dynamic scene
models, mining behaviours and detecting salient events in
three complex and cr...