We present a discriminative latent topic model for scene recognition. The capacity of our model is originated from the modeling of two types of visual contexts, i.e., the category specific global spatial layout of different scene elements, and the reinforcement of the visual coherence in uniform local regions. In contrast, most previous methods for scene recognition either only modeled one of these two visual contexts, or just totally ignored both of them. We cast these two coupled visual contexts in a discriminative Latent Dirichlet Allocation framework, namely context aware topic model. Then scene recognition is achieved by Bayesian inference given a target image. Our experiments on several scene recognition benchmarks clearly demonstrated the advantages of the proposed model.