In this paper, we propose a joint face orientation estimation technique for smart camera networks that does not require the cameras to be localized in advance. The system is composed of in-node coarse estimation and joint refined estimation between cameras. The in-node signal processing algorithms are intentionally designed to be simple and general in order to reduce the required computation, and they yield coarse estimates that may be erroneous. The proposed model-based technique determines the orientation and the angular motion of the face using two features, namely the hair-face ratio and the head optical flow. These features yield an estimate of the face orientation and the angular velocity through Least Squares (LS) analysis. In the joint refined estimation, a discrete-time linear dynamical system is first modeled. The spatiotemporal consistency between cameras is measured by a cost function, which is a weighted quadratic sum of spatial inconsistency, input energy, and in-node estimation error. Minimizing the cost function through Linear...
Chung-Ching Chang, Hamid K. Aghajan
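
The weighted quadratic cost described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `refine_two_camera`, the weights, the two-camera setup, the scalar random-walk dynamics (orientation at step t+1 equals orientation at step t plus an input), and the unknown constant angular offset between the cameras are all assumptions made for illustration. Angle wrap-around is ignored. Because every term is quadratic in the unknowns, this toy cost is minimized here with an ordinary linear least-squares solve, which may differ from the solver used in the paper.

```python
import numpy as np

def refine_two_camera(z1, z2, w_spatial=1.0, w_input=1.0, w_meas=1.0):
    """Jointly refine coarse per-camera orientation estimates (degrees).

    Hypothetical toy model: z1 and z2 are coarse in-node estimates of the
    same face, each expressed in its own camera's frame. The unknowns are
    the refined tracks theta1, theta2 and one constant offset d between
    the two camera frames (the cameras are not localized in advance).
    """
    z1 = np.asarray(z1, dtype=float)
    z2 = np.asarray(z2, dtype=float)
    T = z1.size
    n = 2 * T + 1  # theta1[0..T-1], theta2[0..T-1], offset d
    rows, rhs = [], []

    def add_row(coeffs, target, weight):
        # One weighted residual: sqrt(weight) * (row @ x - target)
        row = np.zeros(n)
        for idx, c in coeffs:
            row[idx] = c
        s = np.sqrt(weight)
        rows.append(s * row)
        rhs.append(s * target)

    for t in range(T):
        # In-node estimation error: refined value vs. coarse estimate
        add_row([(t, 1.0)], z1[t], w_meas)
        add_row([(T + t, 1.0)], z2[t], w_meas)
        # Spatial inconsistency: theta1[t] - theta2[t] - d should be 0
        add_row([(t, 1.0), (T + t, -1.0), (2 * T, -1.0)], 0.0, w_spatial)

    for t in range(T - 1):
        # Input energy: u[t] = theta[t+1] - theta[t] under the assumed dynamics
        add_row([(t + 1, 1.0), (t, -1.0)], 0.0, w_input)
        add_row([(T + t + 1, 1.0), (T + t, -1.0)], 0.0, w_input)

    A = np.vstack(rows)
    b = np.array(rhs)
    # Minimizes ||A x - b||^2, i.e. the weighted quadratic cost
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x[:T], x[T:2 * T], x[2 * T]
```

Calling `refine_two_camera(z1, z2)` with two equal-length sequences of coarse estimates returns the two refined tracks and the estimated inter-camera offset. Raising `w_input` favors smoother tracks, while raising `w_meas` keeps the refined values closer to the in-node estimates.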