Many documents are available to a computer only as images from paper. However, most natural language processing systems expect their input as character-coded text, which may be di...
This paper deals with text extraction from heterogeneous documents for document categorization and indexing tasks. The purpose of this work is to find similar text regions basing ...
Badreddine Khelifi, Nizar Zaghden, Adel M. Alimi, ...
Extracting information from graphical components is a crucial step in chart recognition and understanding. However, existing methods of information extraction from chart im...
The extraction of textual content from colour documents of a graphical nature is a complicated task. The text can be rendered in any colour, size and orientation while the existen...
In this paper we present a top-down, projection-profile based algorithm to separate text blocks from image blocks in a Devanagari document. We use a distinctive feature of Devana...
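As a rough illustration of the projection-profile idea mentioned in the last abstract, the following is a minimal generic sketch, not the authors' algorithm. It assumes a binarized page given as a list of rows with 1 for ink and 0 for background, and it does not use the Devanagari-specific feature the abstract refers to. The function names and the `min_gap` parameter are hypothetical.

```python
def horizontal_profile(image):
    """Count ink pixels (value 1) in each row of a binary image."""
    return [sum(row) for row in image]


def split_rows(image, min_gap=1):
    """Split a page into horizontal bands using its projection profile.

    Returns (start, end) row ranges, end exclusive. A band is closed
    once at least `min_gap` consecutive blank rows follow it.
    `min_gap` is an illustrative parameter, not taken from the paper.
    """
    profile = horizontal_profile(image)
    bands = []
    start = None   # first row of the band currently being built
    blank = 0      # consecutive blank rows seen inside that band
    for y, count in enumerate(profile):
        if count > 0:
            if start is None:
                start = y
            blank = 0
        elif start is not None:
            blank += 1
            if blank >= min_gap:
                # Close the band just before the run of blank rows.
                bands.append((start, y - blank + 1))
                start = None
                blank = 0
    if start is not None:
        # Close a band that runs to the bottom, dropping trailing blanks.
        bands.append((start, len(profile) - blank))
    return bands


page = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
    [0, 0, 0],
    [1, 0, 1],
    [0, 0, 0],
]
bands = split_rows(page, min_gap=2)
```

A top-down method would typically apply the same cut recursively, alternating between row and column profiles inside each band, and then classify each resulting block as text or image using further features.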