This paper presents an approach to extracting and using semantic layers from low altitude aerial videos for scene understanding and object tracking. The input video is captured by low flying aerial platforms and typically consists of strong parallax from non-ground-plane structures. A key aspect of our approach is the use of geo-registration of video frames to reference image databases (such as those available from Terraserver and Google satellite imagery) to establish a geo-spatial coordinate system for pixels in the video. Geo-registration enables Euclidean 3D reconstruction with absolute scale unlike traditional monocular structure from motion where continuous scale estimation over long periods of time is an issue. Geo-registration also enables correlation of video data to other stored information sources such as GIS (Geo-spatial Information System) databases. In addition to the geo-registration and 3D reconstruction aspects, the key contributions of this paper include: (1) exploit...
Jiangjian Xiao, Hui Cheng, Feng Han, Harpreet S. S