Text classification in the medical domain is a real world problem with wide applicability. This paper investigates extensively the effect of text representation approaches on the p...
Fathi H. Saad, Beatriz de la Iglesia, Duncan G. Be...
This paper describes an approach to attention based layout segmentation using general principles of the human visual perception to achieve this goal. The text is considered as tex...
In this paper, we propose a method of text retrieval from document images using a similarity measure based on an N-Gram algorithm. We directly extract image features instead of us...
ABSTRACT: OCR is an error-prone process. It is time-consuming and expensive to manually proofread OCR results. The errors remaining in OCRed texts can cause serious problems in rea...