Complex scenes such as underground stations and malls contain static occluding structures such as walls, entrances, columns, turnstiles and barriers. Unless this occlusion landscape is made explicit, such structures can defeat attempts to track individuals through the scene. This paper describes a method of generating probability density functions for the depth of the scene at each pixel from a training set of detected blobs, i.e., observations of moving people. As these results are necessarily noisy, a regularization process is employed to recover the most self-consistent scene depth structure. An occlusion reasoning framework is proposed to enable object tracking methods to make effective use of the recovered depth.
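To make the first stage of the pipeline concrete, the following is a minimal Python sketch, not the paper's implementation, of accumulating per-pixel depth observations from detected blobs into histograms and normalizing them into PDFs. The blob representation (a boolean mask plus a scalar depth estimate), the function names, and the binning are assumptions for illustration only; the regularization step is omitted.

```python
import numpy as np

def accumulate_depth_histograms(blobs, frame_shape, depth_bins):
    """Accumulate per-pixel depth votes from detected blobs.

    blobs: iterable of (mask, depth) pairs, where mask is a boolean array of
           frame_shape marking the pixels covered by a detected person and
           depth is that person's estimated depth (hypothetical representation).
    Returns counts of shape (H, W, len(depth_bins) - 1).
    """
    h, w = frame_shape
    counts = np.zeros((h, w, len(depth_bins) - 1))
    for mask, depth in blobs:
        b = np.searchsorted(depth_bins, depth) - 1   # depth bin for this observation
        if 0 <= b < counts.shape[2]:
            counts[mask, b] += 1                     # vote at every covered pixel
    return counts

def normalise_to_pdfs(counts, eps=1e-9):
    """Convert per-pixel vote counts into per-pixel depth PDFs."""
    totals = counts.sum(axis=2, keepdims=True)
    return counts / np.maximum(totals, eps)

# Example with two synthetic observations on a 4x4 frame
bins = np.linspace(0.0, 10.0, 6)                     # 5 depth bins over 0-10 m
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
pdfs = normalise_to_pdfs(
    accumulate_depth_histograms([(mask, 3.2), (mask, 3.5)], (4, 4), bins)
)
```

Per-pixel PDFs obtained this way are noisy wherever few blobs were observed, which is the motivation for the regularization step described in the paper.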