Compound (or mixed) document images contain graphic or textual content along with pictures. They are a very common form of document, found in magazines, brochures, websites, etc. Because of the very distinct nature of these two image classes (text/graphics vs. pictures), their compression invariably involves multiple compression systems and a region segmentation (classification) method. We review state-of-the-art technologies on the subject, focusing on the mixed raster content (MRC) multi-layer approach. We also present new results on segmentation for MRC using optimized, rate-distortion-based block thresholding.
Ricardo L. de Queiroz