Multi-camera networks provide rich visual data across both space and time. In this paper a method of human posture estimation is described that builds on an opportunistic fusion framework, aiming to exploit the many sources of visual information available across space, time, and feature levels. One motivation for the proposed method is to reduce the raw visual data in a single camera to parameterized elliptical segments for efficient communication between cameras. A 3D human body model serves as the convergence point of spatiotemporal and feature fusion. It maintains both the geometric parameters of the human posture and adaptively learned appearance attributes, all of which are updated along the three dimensions of the opportunistic fusion: space, time, and features. When confidence levels are sufficient, parameters of the 3D human body model are in turn fed back to aid subsequent in-node vision analysis. The color distribution registered in the model is used to initialize...
Chen Wu, Hamid K. Aghajan
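The abstract mentions reducing a camera's raw visual data to parameterized elliptical segments before inter-camera communication. The paper does not specify the fitting procedure here; the sketch below is a minimal illustration, assuming a standard moment-based equivalent-ellipse fit of a binary segment mask, with the hypothetical helper name `ellipse_params` chosen for exposition only.

```python
import numpy as np

def ellipse_params(mask: np.ndarray):
    """Reduce a binary segment mask to five scalars:
    centroid (cx, cy), semi-axis lengths, and orientation."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None
    cx, cy = xs.mean(), ys.mean()                     # centroid
    dx, dy = xs - cx, ys - cy
    # second-order central moments of the segment
    mxx, myy, mxy = (dx * dx).mean(), (dy * dy).mean(), (dx * dy).mean()
    cov = np.array([[mxx, mxy], [mxy, myy]])
    evals, evecs = np.linalg.eigh(cov)                # eigenvalues in ascending order
    minor, major = 2.0 * np.sqrt(evals)               # semi-axes of the equivalent solid ellipse
    theta = np.arctan2(evecs[1, 1], evecs[0, 1])      # orientation of the major axis
    return cx, cy, major, minor, theta

# Usage: a synthetic elongated blob collapses to five numbers,
# which is far cheaper to transmit between cameras than the raw pixels.
mask = np.zeros((60, 60), dtype=bool)
mask[20:40, 10:50] = True
print(ellipse_params(mask))
```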