The key issue with overlapping structures, or concurrent markup hierarchies, in XML encodings of documents is that markup in one hierarchy is not necessarily well-formed with respect to the...
This paper presents PDF-TREX, a heuristic approach for table recognition and extraction from PDF documents. The heuristic starts from an initial set of basic content elements an...
Business Process Re-engineering (BPR) is an area that requires a lot of technical documents, and an important feature of a well-written document is a coherent narrative. Even thou...
Multimedia documents are important in several application areas, such as education, training, advertising and entertainment. Since multimedia documents may comprise continuous...
The PDF format is commonly used for the exchange of documents on the Web and there is a growing need to understand and extract or repurpose data held in PDF documents. Many system...