This paper introduces a simple and efficient representation for natural images. We partition an image into blocks and treat the blocks as vectors in a high-dimensional space. We then fit a piecewise linear model (i.e., a union of affine subspaces) to the vectors at each down-sampling scale. We call this a multi-scale hybrid linear model of the image. The hybrid and hierarchical structure of this model allows us to effectively extract and exploit multi-modal correlations among the image data at different scales. It remedies, both conceptually and computationally, limitations of many existing image representation methods that are based on a fixed linear transformation (e.g., DCT, wavelets), an adaptive uni-modal linear transformation (e.g., PCA), or a multi-modal model at a single scale. We justify both analytically and experimentally why and how such a simple multi-scale hybrid model is able to simultaneously reduce the model complexity and the computational cost. Despite a small over...