We propose a novel approach for modelling correlations
between activities in a busy public space captured by multiple
non-overlapping and uncalibrated cameras. In our approach,
each camera view is automatically decomposed into
semantic regions, across which different spatio-temporal
activity patterns are observed. A novel Cross Canonical
Correlation Analysis (xCCA) framework is formulated to
detect and quantify temporal and causal relationships between
regional activities within and across camera views.
The approach accomplishes three tasks: (1) estimating the
spatial and temporal topology of the camera network; (2)
facilitating more robust and accurate person re-identification;
and (3) performing global activity modelling and video temporal
segmentation by linking visual evidence collected across
camera views. Our approach differs from the state of the art
in that it does not rely on either intra- or inter-camera tracking.
It can therefore be applied to even the most challenging
...