We address the problem of recognizing, in dynamic meetings in which people do not remain seated all the time, the visual focus of attention (VFOA) of seated people from their head pose and contextual activity cues. We propose a model that comprises the VFOA of a meeting participant as the hidden state, and his head pose as the observation. To account for the presence of moving visual targets due to the dynamic nature of the meeting, the locations of the visual targets are used as an input variables to the head pose observation model. Contextual information is introduced in the VFOA dynamics through a slide activity variable and speaking or visual activity variables that relate people's focus to the meeting activity context.
Sileye O. Ba, Hayley Hung, Jean-Marc Odobez