An approach for incremental learning of a 3D scene from a single static video camera is presented in this paper. In particular, we exploit the presence of casual people walking in the scene to infer relative depth, learn shadows, and segment the critical ground structure. Considering that this type of video data is so ubiquitous, this work provides an important step towards 3D scene analysis from single cameras in readily available ordinary videos and movies. On-line 3D scene learning, as presented here, is very important for applications such as scene analysis, foreground refinement, tracking, biometrics, automated camera collaboration, activity analysis, identification, and real-time computer-graphics applications. The main contributions of this work are then two-fold. First, we use the people in the scene to continuously learn and update the 3D scene parameters using an incremental robust (L1) error minimization. Secondly, models of shadows in the scene are learned using a statisti...
Diego Rother, Kedar A. Patwardhan, Guillermo Sapir