Large quantities of documents in the Internet and digital libraries are simply scanned and archived in image format, many of which are packed in PDF files. The word search tool pr...
This work presents the application of a first-order logic incremental learning system, INTHELEX, to learn rules for the automatic identification of a wide range of significant docu...
Teresa Maria Altomare Basile, Stefano Ferilli, Nic...
This work provides algorithms and heuristics to index text documents by determining important topics in the documents. To index text documents, the work provides algorithms to gene...
In this paper, we present a compressed pattern matching method for searching user queried words in the CCITT Group 4 compressed document images, without decompressing. The feature...
A number of techniques have previously been proposed for effective thresholding of document images. In this paper two new thresholding techniques are proposed and compared against...
Graham Leedham, Yan Chen, Kalyan Takru, Joie Hadi ...