In this article, we introduce a new problem: the construction of multi-structured documents. We first offer an overview of existing solutions to the representation of such docum...
A large annotated corpus is critical to the development of robust optical character recognizers (OCRs). However, creation of annotated corpora is a tedious task. It is laborious, ...
In this paper, we describe a system to perform Document Image Retrieval in Digital Libraries. The system allows users to retrieve digitized pages on the basis of layout similaritie...
We present methods for eliminating or reducing the distortion in a scanned image. Aspects of the present paper allow for the automatic pruning, de-skewing, and unwarping of an ima...
This paper addresses the problem of document multi-structuring. Depending on the intended use, several distinct structures may be defined simultaneously for the same original document. For ex...
Noureddine Chatti, Sylvie Calabretto, Jean-Marie P...