We propose an approach that parses registered images captured at ground level into architectural units for large-scale city modeling. Each parsed unit has a regularized shape, which can be used for further modeling purposes. In our approach, we first parse the environment into buildings, the ground, and the sky using a joint 2D-3D segmentation method. Then, we partition buildings into individual fac