Bandwidth limitations remain a major challenge for delivering high-quality multimedia content to users over the Internet. In this research, we aim to improve the compression of presentation videos (e.g., lectures). The approach is based on the idea that people tend to pay more attention to the face and gesturing hands, so these regions are given more resolution than the rest of the image. Our method first detects and tracks the face and hand regions using color-based segmentation and Kalman filtering. Next, different classes of natural hand gestures are recognized from the hand trajectories by identifying gesture holds, position/velocity changes, and repetitive movements. The detected face/hand regions and gesture events in the video are then encoded at higher resolution than the remaining lower-resolution background. We present results of the tracking and gesture recognition approach, and evaluate and compare videos compressed with the proposed method to un...
Robin Tan, James W. Davis
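
The abstract mentions identifying gesture holds from the tracked hand trajectories. As a rough illustration only, and not the authors' algorithm, a hold could be approximated as a run of consecutive frames in which the hand moves very little. The function name `find_holds` and the thresholds `max_speed` and `min_frames` below are hypothetical choices made for this sketch; the input is assumed to be one (x, y) hand position per frame, such as a tracker's output.

```python
from math import hypot


def find_holds(points, max_speed=2.0, min_frames=3):
    """Return (start, end) frame-index pairs (inclusive) of near-stationary runs.

    points     -- list of (x, y) hand positions, one per frame (assumed input)
    max_speed  -- hypothetical threshold in pixels per frame
    min_frames -- hypothetical minimum number of frames for a run to count
    """
    holds = []
    start = None  # first frame of the current low-motion run, if any
    for i in range(1, len(points)):
        (x0, y0), (x1, y1) = points[i - 1], points[i]
        speed = hypot(x1 - x0, y1 - y0)  # displacement between adjacent frames
        if speed <= max_speed:
            if start is None:
                start = i - 1
        else:
            # The run ended at frame i - 1; keep it only if it is long enough.
            if start is not None and (i - 1) - start + 1 >= min_frames:
                holds.append((start, i - 1))
            start = None
    # Close a run that extends to the last frame.
    if start is not None and len(points) - start >= min_frames:
        holds.append((start, len(points) - 1))
    return holds


# Illustrative trajectory: the hand moves, pauses for a few frames, then moves again.
trajectory = [(0, 0), (10, 0), (20, 0), (20, 1), (21, 1), (21, 1), (30, 5), (40, 10)]
holds = find_holds(trajectory)
```

In a pipeline like the one described, frame ranges flagged this way could be among the events selected for higher-resolution encoding. The thresholds would need tuning to the frame rate and image size, and raw positions would likely be smoothed first (for example, with the Kalman filter estimates).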