In this paper, we address stereo matching in the presence of a class of non-Lambertian effects in which image formation can be modeled as the additive superposition of layers at different depths. Such effects prevent traditional stereo vision algorithms, which rely on direct color matching, from recovering depths. We develop several techniques to estimate both the depths and the colors of the component layers. Depth hypotheses are enumerated in pairs, one from each layer, in a nested plane sweep. For each pair of depth hypotheses, matching is accomplished using spatio-temporal differencing. We then use graph cut optimization to solve for the depths of both layers. This is followed by an iterative color update algorithm, which we prove to be convergent. Our algorithm recovers depth and color estimates for both synthetic and real image sequences.
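
As a rough illustration of how a nested sweep over pairs of depth hypotheses can be combined with differencing, the following Python sketch works on a toy one-dimensional, two-layer scene. It is not the implementation described in this paper: the circular integer shifts, the variance-based cost, and the names `render_views`, `pair_cost`, and `nested_sweep` are all assumptions made for illustration, and the graph cut and color update stages are omitted.

```python
from itertools import combinations

import numpy as np


def render_views(layer_a, layer_b, d_a, d_b, n_views):
    """Toy additive model (assumption): view k is the sum of both layers,
    each circularly shifted by k times its own integer disparity."""
    return [np.roll(layer_a, k * d_a) + np.roll(layer_b, k * d_b)
            for k in range(n_views)]


def pair_cost(views, d_a, d_b):
    """Cost of one pair of disparity hypotheses (hypothetical cost function)."""
    # Undo the hypothesised motion of the first layer so it is static.
    stabilised = [np.roll(v, -k * d_a) for k, v in enumerate(views)]
    # Differences between consecutive views cancel a static layer.
    diffs = [stabilised[k + 1] - stabilised[k]
             for k in range(len(stabilised) - 1)]
    # Undo the remaining relative motion of the second layer.
    aligned = [np.roll(d, -k * (d_b - d_a)) for k, d in enumerate(diffs)]
    # If both hypotheses are right, the aligned differences are identical.
    return float(np.var(np.stack(aligned), axis=0).sum())


def nested_sweep(views, candidates):
    """Score every pair d_a < d_b and return the cheapest pair and all costs."""
    costs = {pair: pair_cost(views, *pair)
             for pair in combinations(candidates, 2)}
    return min(costs, key=costs.get), costs


# Synthetic example: integer-valued layers keep the arithmetic exact.
rng = np.random.default_rng(0)
layer_a = rng.integers(0, 256, size=64).astype(float)
layer_b = rng.integers(0, 256, size=64).astype(float)
views = render_views(layer_a, layer_b, d_a=2, d_b=5, n_views=4)

best, costs = nested_sweep(views, candidates=range(6))
print(best, costs[best])
```

Because the additive model is symmetric in the two layers, swapping the two hypotheses gives the same cost, so the sketch enumerates only pairs with `d_a < d_b`. At least three views are needed for the variance across differences to be informative.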