3D models of urban sites with geometry and facade textures are needed for many planning and visualization applications. Approximate 3D wireframe models can be derived from aerial images, but detailed textures must be obtained from ground-level images. Integrating such views with the 3D models is difficult, as only small parts of buildings may be visible in a single view. We describe a method that uses two or three vanishing points and three 3D-to-2D line correspondences to estimate the rotational and translational parameters of the ground-level cameras. The valid set of 3D-to-2D line pairs is chosen from among the many possible combinations by a hypothesis generation and evaluation process. Some experimental results are presented.
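As a rough illustration of how vanishing points can constrain camera rotation, the sketch below recovers a rotation matrix from two vanishing points of orthogonal building directions. It is not the method of this paper: it assumes the intrinsics (focal length `f` and principal point `cx`, `cy`) are known, and all function names are hypothetical. Each vanishing point is back-projected to a unit direction in the camera frame, the second direction is orthogonalized against the first, and the third axis is taken as their cross product. The sign of each recovered axis is ambiguous in general and would have to be resolved by other means, for example by the line correspondences.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def back_project(vp, f, cx, cy):
    """Unit direction in the camera frame for a finite vanishing point (pixels).

    Assumes a pinhole camera with square pixels and no skew (an assumption
    of this sketch, not stated in the source).
    """
    u, v = vp
    return normalize([(u - cx) / f, (v - cy) / f, 1.0])


def rotation_from_vanishing_points(vp_x, vp_y, f, cx, cy):
    """Estimate a world-to-camera rotation from two vanishing points.

    vp_x and vp_y are the vanishing points of two orthogonal world axes.
    Returns a 3x3 matrix (list of rows) whose columns are the world X, Y, Z
    axes expressed in the camera frame, up to a sign ambiguity per axis.
    """
    r1 = back_project(vp_x, f, cx, cy)
    d2 = back_project(vp_y, f, cx, cy)
    # Gram-Schmidt step: remove any component of d2 along r1, since noisy
    # vanishing points rarely give exactly orthogonal directions.
    k = dot(r1, d2)
    r2 = normalize([b - k * a for a, b in zip(r1, d2)])
    # The third axis completes a right-handed frame.
    r3 = cross(r1, r2)
    return [[r1[i], r2[i], r3[i]] for i in range(3)]
```

When a third vanishing point is available, it could serve as a consistency check on `r3` instead of being ignored, which is one plausible reading of the "two or three vanishing points" in the text.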