In this paper, we revisit the problem of detecting the page numbers of a document. This work is motivated by a need for a generic method which applies on a large variety of docume...
In the field of computer analysis of document images, the problems of physical and logical layout analysis have been approached through a variety of heuristic, rule-based, and gr...
. When publishing documents on the web, the user needs to describe and classify her documents for the benefit of later retrieval and use. This paper presents an approach to semanti...
Document Object Modeling (DOM) is widely used approach for retrieving data from an XML document. If the size of the XML document is very large, however, using the DOM approach for...
Seung Min Kim, Suk I. Yoo, Eunji Hong, Tae Gwon Ki...
Patent document images maintained by the U.S. patent database have a specific format, in which figures and text descriptions are separated into different sections. This makes it...