We propose a novel unsupervised learning algorithm that extracts the layout of an image by learning latent object-related aspects. Unlike traditional image segmentation algorithms, which segment an image using feature similarity, our method learns high-level object characteristics (aspects) from a large number of unlabelled images containing similar objects and uses them to facilitate image segmentation. The method requires no human annotation of the training set. We use a graphical model to address aspect learning and layout extraction jointly. In particular, the aspect-feature dependency across multiple images is learned via the Expectation-Maximization algorithm. We demonstrate that, by associating latent aspects with spatial structure, the proposed method achieves much better layout extraction results than Probabilistic Latent Semantic Analysis.