We propose a nonparametric framework based on the beta process for discovering temporal patterns within a heterogeneous video collection. Starting from quantized local motion descriptors, we describe the long-range temporal dynamics of each video via transitions between a set of dynamical behaviors. Bayesian nonparametric statistical methods allow the number of such behaviors and the subset exhibited by each video to be learned without supervision. We extend the earlier beta process hidden Markov model (BP-HMM) in two ways: adding data-driven MCMC moves to improve inference on realistic datasets, and allowing global sharing of behavior transition parameters. We illustrate discovery of intuitive and useful dynamical structure, at various temporal scales, from videos of simple exercises, recipe preparation, and Olympic sports. Segmentation and retrieval experiments show the benefits of our nonparametric approach.
Michael C. Hughes, Erik B. Sudderth
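To make the nonparametric sharing mechanism concrete, the sketch below simulates the Indian buffet process, the marginal of the beta-Bernoulli process that underlies the BP-HMM's feature model: each video selects a binary subset of a potentially unbounded set of behaviors, reusing popular behaviors and occasionally introducing new ones. This is an illustrative simulation under stated assumptions, not the authors' inference code; all names here are hypothetical.

```python
import numpy as np

def sample_ibp(num_videos, alpha=2.0, seed=0):
    """Draw per-video behavior subsets from an Indian buffet process
    with mass parameter alpha (the beta process marginal)."""
    rng = np.random.default_rng(seed)
    dish_counts = []   # dish_counts[k] = how many videos so far use behavior k
    assignments = []   # binary behavior-usage vector for each video
    for i in range(1, num_videos + 1):
        # Video i reuses an existing behavior k with probability m_k / i.
        f = [rng.random() < c / i for c in dish_counts]
        # It also introduces Poisson(alpha / i) brand-new behaviors.
        num_new = rng.poisson(alpha / i)
        for k, used in enumerate(f):
            if used:
                dish_counts[k] += 1
        dish_counts.extend([1] * num_new)
        assignments.append(f + [True] * num_new)
    # Pack into a binary matrix: rows are videos, columns are behaviors.
    K = len(dish_counts)
    F = np.zeros((num_videos, K), dtype=bool)
    for i, f in enumerate(assignments):
        F[i, :len(f)] = f
    return F

F = sample_ibp(num_videos=5)
print(F.astype(int))  # behaviors (columns) are shared across videos (rows)
```

Note how the number of columns is not fixed in advance: it grows with the collection at a rate governed by alpha, which is what allows the number of behaviors and each video's subset to be learned rather than specified.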