We propose a novel approach to understanding
activities from partial observations captured by
multiple non-overlapping cameras separated by unknown time
gaps. In our approach, each camera view is first decomposed
automatically into regions based on the correlation of object
dynamics across different spatial locations in all camera
views. A new Cross Canonical Correlation Analysis (xCCA)
is then formulated to discover and quantify the time-delayed
correlations of regional activities observed within and across
multiple camera views in a single common reference space.
We show that learning the time-delayed activity correlations
offers important contextual information for (i) spatial and
temporal topology inference of a camera network; (ii) robust
person re-identification; and (iii) global activity interpretation
and video temporal segmentation. Crucially, in contrast
to conventional methods, our approach does not rely on either
intra-camera or inter-camera objec...