We present a method for estimating the point of fixation of an air traffic controller from a low-resolution video sequence. A geometric model of the head is used to estimate head orientation; the head pose estimates are then combined with a 3D model of the environment to compute the target of gaze. The head model is constructed from a small set of images. Two methods are considered: in the first, we treat the head as a textured object and ignore lighting effects; in the second, we jointly estimate the albedo of each facet of the head model and the parameters of a simple lighting model. Because ground-truth data are unavailable, the absolute accuracy of the gaze estimates is unknown, but incorporating the lighting model appears to reduce the noise level. With either method, the results are sufficiently accurate to answer questions of operational interest, such as “is the controller looking out the window?”
Xavier L. C. Brolly, Constantinos Stratelos, Jeffr
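
The abstract describes combining a head pose estimate with a 3D model of the environment to find the target of gaze. The following is only a minimal sketch of that geometric step, not the authors' implementation: it assumes the gaze direction coincides with the head direction, parameterizes orientation by yaw and pitch angles, and models the window as an axis-aligned rectangle on a plane of constant x. The function names, the coordinate frame, and the window dimensions are all hypothetical.

```python
import math

def head_direction(yaw_deg, pitch_deg):
    """Unit vector along the head direction.

    Assumed convention (not from the paper): yaw rotates about the
    vertical z axis, pitch is elevation above the horizontal plane.
    """
    yaw = math.radians(yaw_deg)
    pitch = math.radians(pitch_deg)
    return (
        math.cos(pitch) * math.cos(yaw),
        math.cos(pitch) * math.sin(yaw),
        math.sin(pitch),
    )

def intersect_x_plane(origin, direction, plane_x):
    """Point where the ray origin + t * direction (t >= 0) meets the
    plane x = plane_x, or None if the ray is parallel or points away."""
    if abs(direction[0]) < 1e-9:
        return None  # ray runs parallel to the plane
    t = (plane_x - origin[0]) / direction[0]
    if t < 0:
        return None  # plane lies behind the head
    return tuple(o + t * d for o, d in zip(origin, direction))

def looking_at_window(head_pos, yaw_deg, pitch_deg, window):
    """True if the head-direction ray hits the window rectangle.

    `window` is a hypothetical dict holding the plane position `x`
    and the rectangle bounds `y_min`, `y_max`, `z_min`, `z_max`.
    """
    hit = intersect_x_plane(
        head_pos, head_direction(yaw_deg, pitch_deg), window["x"]
    )
    if hit is None:
        return False
    _, y, z = hit
    return (window["y_min"] <= y <= window["y_max"]
            and window["z_min"] <= z <= window["z_max"])

# Hypothetical scene: window 3 m in front of a head at 1.5 m height.
WINDOW = {"x": 3.0, "y_min": -2.0, "y_max": 2.0, "z_min": 1.0, "z_max": 2.5}
HEAD = (0.0, 0.0, 1.5)
```

Calling `looking_at_window(HEAD, 0.0, 0.0, WINDOW)` tests a head facing the window straight on, while a yaw of 180 degrees faces away from it. Because the question is binary, a moderately noisy pose estimate can still give the right answer as long as the ray stays well inside or well outside the rectangle, which is consistent with the abstract's claim that either method suffices for questions of this kind.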