Multi-camera networks open up a variety of vision-based applications by providing rich visual information. This paper presents an image segmentation method for human gesture analysis in multi-camera networks. To exploit the multiple sources of visual information provided by the network, an opportunistic fusion framework is described and incorporated into the proposed gesture analysis method. A 3D human body model serves as the converging point of spatiotemporal and feature fusion. It maintains both the geometric parameters of the human posture and adaptively learned appearance attributes, all of which are updated along the three dimensions of the opportunistic fusion: space, time, and features. When their confidence levels are sufficient, the parameters of the 3D human body model are fed back to aid subsequent vision analysis. The 3D human body model also serves as an intermediate level for gesture interpretation in different applications. Th...
Chen Wu, Hamid K. Aghajan