This paper addresses the problem of image representation in visual sensor networks. We propose a new image representation scheme based on compressive sensing (CS), since CS can reduce the computational complexity of an image/video encoder. In our scheme, the encoder first decomposes the input image into two components, a dense component and a sparse component; the dense component is then encoded by a traditional approach, while the sparse component is encoded by a CS technique. To improve rate-distortion performance, we exploit the strong correlation between the dense and sparse components. Given the measurements and a prediction of the sparse component, we use projection onto convex sets (POCS) to reconstruct the sparse component. Compared with existing CS methods, our method considerably reduces both the number of random measurements needed and the computational complexity of decoding.
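
To make the reconstruction step concrete, the following is a minimal sketch of a POCS iteration, not the exact procedure used in this paper. It assumes two convex sets that the abstract does not specify: the affine set of signals consistent with the measurements, {x : Phi x = y}, and a box of half-width `delta` around the prediction. The names `pocs_reconstruct`, `Phi`, and `delta` are hypothetical and chosen only for illustration.

```python
import numpy as np


def pocs_reconstruct(Phi, y, prediction, delta, n_iter=200):
    """Alternate projections onto two convex sets (illustrative assumption).

    Set A: {x : Phi @ x = y}               (consistency with CS measurements)
    Set B: {x : |x - prediction| <= delta} (closeness to the prediction)
    """
    Phi_pinv = np.linalg.pinv(Phi)  # pseudo-inverse, computed once
    x = prediction.astype(float)    # start from the prediction (a new array)
    for _ in range(n_iter):
        # Orthogonal projection onto the affine set A.
        x = x - Phi_pinv @ (Phi @ x - y)
        # Projection onto the box B is an element-wise clip.
        x = np.clip(x, prediction - delta, prediction + delta)
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m = 64, 24                       # signal length, number of measurements
    x_true = np.zeros(n)
    x_true[rng.choice(n, 5, replace=False)] = rng.standard_normal(5)
    Phi = rng.standard_normal((m, n)) / np.sqrt(m)  # random measurement matrix
    y = Phi @ x_true
    # Synthetic prediction: the true signal plus a bounded perturbation.
    prediction = x_true + 0.05 * rng.uniform(-1.0, 1.0, n)
    x_hat = pocs_reconstruct(Phi, y, prediction, delta=0.05)
    print("prediction error:    ", np.linalg.norm(prediction - x_true))
    print("reconstruction error:", np.linalg.norm(x_hat - x_true))
```

When the true signal lies in both sets, each projection cannot move the iterate farther from it, so under these assumptions the reconstruction is never worse than the prediction it starts from. This gives one intuition for why a good prediction could reduce the number of measurements needed.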