The 2.1D sketch is a layered representation of the occluding and occluded surfaces in a scene. Extracting the 2.1D sketch from a single image is a difficult and important problem that arises in many applications. We present a fast and robust algorithm that estimates the scene layers from two visual cues about scene structure: the boundaries of image regions and T-junctions. The estimation is formulated as a quadratic optimization with hinge-loss-based constraints, such that the 2.1D sketch is smooth everywhere except across image contours, and the image regions forming the "stems" of T-junctions correspond to occluded surfaces in the scene. Quantitative and qualitative results on challenging real-world images, namely the Stanford depthmap and Berkeley segmentation datasets, demonstrate the high accuracy, efficiency, and robustness of our approach.
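
To make the formulation concrete, the following is a minimal sketch of one way a quadratic smoothness term can be combined with hinge-loss ordering constraints derived from T-junctions. It is an illustration under stated assumptions, not the solver described in this work: the function name `estimate_layers`, the parameters `margin`, `c`, `lr`, and `steps`, the convention that a larger layer value means a surface farther from the camera, and the use of plain subgradient descent are all hypothetical choices made for the example.

```python
# Hypothetical sketch: one layer value per image region.
# Assumed convention: larger value = farther from the camera.
def estimate_layers(n_regions, smooth_edges, t_junctions,
                    margin=1.0, c=1.0, lr=0.05, steps=500):
    """Minimize, by subgradient descent,
         sum_{(i,j,w)} w * (d[i] - d[j])**2
       + c * sum_{(stem,cap)} max(0, margin - (d[stem] - d[cap])).
    smooth_edges: (i, j, w) for adjacent regions; w is assumed small
                  across strong contours so layers may jump there.
    t_junctions:  (stem, cap) pairs; the stem region is treated as occluded.
    """
    d = [0.0] * n_regions
    for _ in range(steps):
        grad = [0.0] * n_regions
        # Quadratic smoothness term between adjacent regions.
        for i, j, w in smooth_edges:
            g = 2.0 * w * (d[i] - d[j])
            grad[i] += g
            grad[j] -= g
        # Hinge term: active only while the stem is not yet
        # at least `margin` behind the occluding (cap) region.
        for stem, cap in t_junctions:
            if margin - (d[stem] - d[cap]) > 0.0:
                grad[stem] -= c
                grad[cap] += c
        d = [di - lr * gi for di, gi in zip(d, grad)]
    return d

# Toy scene: region 0 occludes region 1, which occludes region 2.
layers = estimate_layers(
    n_regions=3,
    smooth_edges=[(0, 1, 0.1), (1, 2, 0.1)],
    t_junctions=[(1, 0), (2, 1)],
)
print(layers)
```

In this toy scene the recovered values order the regions from front to back (region 0, then 1, then 2). Because the objective is unchanged by adding a constant to every layer, only the relative ordering and spacing of the values are meaningful.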