We propose an approach to restore severely degraded
document images using a probabilistic context model. Un-
like traditional approaches that use previously learned
prior models to restore an image, we are able to learn the
text model from the degraded document itself, making the
approach independent of script, font, style, etc. We model
the contextual relationship using an MRF. The ability to
work with larger patch sizes allows us to deal with severe
degradations including cuts, blobs, merges and vandalized
documents. Our approach can also integrate document
restoration and super-resolution into a single framework,
thus directly generating high quality images from degraded
documents. Experimental results show significant improve-
ment in image quality on document images collected from
various sources including magazines and books, and com-
prehensively demonstrate the robustness and adaptability of
the approach. It works well with document collections such
as books, e...
Jyotirmoy Banerjee, Anoop M. Namboodiri, C. V. Jaw