In this paper, we propose a novel approach to scene modeling that automatically discovers intermediate semantic concepts. We use Maximization of Mutual Information (MMI) co-clustering to discover clusters of semantic concepts, which we call intermediate concepts. Each intermediate concept corresponds to a cluster of visterms in the Bag of Visterms (BOV) paradigm for scene classification. MMI co-clustering yields fewer but more meaningful clusters. Unlike k-means, which is used in BOV to cluster image patches based on their appearance, MMI co-clustering groups visterms that are highly correlated with the same concept. Unlike probabilistic Latent Semantic Analysis (pLSA), which can be considered one-sided soft clustering, MMI co-clustering clusters visterms and images simultaneously, so each clustering can improve the other. In addition, MMI co-clustering is unsupervised. We have extensively tested our proposed approach on two cha...
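To make the co-clustering criterion concrete, the following is a minimal sketch. It assumes the criterion is the mutual information between visterm clusters and image clusters, computed from a visterm-by-image co-occurrence table. The toy counts, the function names `mutual_information` and `clustered_joint`, and the hand-picked cluster labels are illustrative assumptions, not the paper's implementation. No optimization is performed; the sketch only compares two fixed groupings.

```python
import numpy as np


def mutual_information(joint):
    """Mutual information (in nats) of a joint count or probability table."""
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()                # normalize counts to probabilities
    px = joint.sum(axis=1, keepdims=True)      # row (visterm) marginal
    py = joint.sum(axis=0, keepdims=True)      # column (image) marginal
    mask = joint > 0                           # skip zero cells: 0 * log(0) := 0
    ratio = joint[mask] / (px @ py)[mask]
    return float(np.sum(joint[mask] * np.log(ratio)))


def clustered_joint(counts, row_labels, col_labels):
    """Sum co-occurrence counts within each (row cluster, column cluster) block."""
    counts = np.asarray(counts, dtype=float)
    rows = np.asarray(row_labels)
    cols = np.asarray(col_labels)
    out = np.zeros((rows.max() + 1, cols.max() + 1))
    # Accumulate every cell into the block given by its row and column cluster.
    np.add.at(out, (rows[:, None], cols[None, :]), counts)
    return out


# Hypothetical toy data: 4 visterms (rows) by 4 images (columns).
# Visterms 0-1 occur only in images 0-1, visterms 2-3 only in images 2-3.
counts = np.array([
    [5, 5, 0, 0],
    [4, 6, 0, 0],
    [0, 0, 6, 4],
    [0, 0, 5, 5],
])

full_mi = mutual_information(counts)

# Grouping that follows the co-occurrence structure.
good_mi = mutual_information(clustered_joint(counts, [0, 0, 1, 1], [0, 0, 1, 1]))

# Grouping that mixes unrelated visterms and images.
bad_mi = mutual_information(clustered_joint(counts, [0, 1, 0, 1], [0, 1, 0, 1]))

print(f"unclustered: {full_mi:.4f}  good: {good_mi:.4f}  bad: {bad_mi:.4f}")
```

In this toy case the grouping that follows the co-occurrence structure retains most of the mutual information of the unclustered table, while the mixed grouping loses almost all of it. A co-clustering procedure under this assumption would search for the row and column labels that keep this quantity as high as possible.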