Despite the current practice of re-keying most documents placed in digital libraries, we continue to try to improve the accuracy of automated recognition techniques for obtaining document image content. This task is made more difficult when the document in question has been printed by letterpress, has aged for hundreds of years, and has been microfilmed before scanning. We endeavored to leave intact a previously described document reconstruction technique and to enhance the document image, bringing its perceived production values up to a more modern standard, in order to process a novel of historic importance: Don Quixote by Miguel de Cervantes Saavedra. Pre-processing of the page images before application of the reconstruction techniques was performed to accommodate early 17th-century typography and low-quality scanned microfilm images. Though our technology easily outstripped the capabilities of commercial OCR systems, it too was found lacking, at this stage of development,...
A. Lawrence Spitz