Human visual attention (HVA) is an important strategy for focusing on specific information while observing and understanding visual stimuli. HVA involves making a series of fixations on select locations while performing tasks such as object recognition and scene understanding. We present one of the first works that combines fixation information with automated concept detectors to (i) infer abstract image semantics, and (ii) enhance the performance of object detectors. We develop visual attention-based models that sample fixation distributions and fixation transition distributions between regions-of-interest (ROIs) to infer abstract semantics such as expressive faces and object interactions (such as look, read, etc.). We also exploit eye-gaze information to deduce the possible locations and scale of salient concepts to aid state-of-the-art detectors. We observe an 18% performance increase with over 80% reduction in computational time for the state-of-the-art object detector in [4].
Harish Katti, Subramanian Ramanathan, Mohan S. Kan