To overcome the poor readability of text and the poor recognizability of image features in low-resolution thumbnails, a novel representation of compound document images, the SmartNail, is presented. SmartNails replace or supplement traditional thumbnails for compound documents and contain cropped and scaled image and text segments. Image-based and text-based analyses are merged to generate a layout for a particular display size, composed of selected text regions that remain readable and image regions that remain recognizable. The analysis is performed efficiently by using information from document layout analysis and from JPEG 2000 compressed file headers.
Kathrin Berkner, Edward L. Schwartz, Christophe Ma