Abstract. In this paper we present a system, DoLSuD, for the automatic discovery of relevant substructures in a document layout. DoLSuD, Document Layout Substructure Discovery, ext...
Developing personalized applications for the ubiquitous Web requires providing different user interfaces that address the heterogeneous capabilities of device classes. Major problems are...
This article presents Xed, a reverse engineering tool for PDF documents, which extracts the original document layout structure. Xed mixes electronic extraction methods with state-...
A new hybrid page layout analysis algorithm is proposed, which uses bottom-up methods to form an initial data-type hypothesis and locate the tab-stops that were used when the page...
The organization of a document collection into meaningful groups is a fundamental issue in document management systems. The grouping can be carried out by performing a comparison ...
Stefano Ferilli, Teresa Maria Altomare Basile, Mar...