Abstract. In this paper we present a system, DoLSuD, for the automatic discovery of relevant substructures in a document layout. DoLSuD, Document Layout Substructure Discovery, ext...
Summarization of text documents is increasingly important with the amount of data available on the Internet. The large majority of current approaches view documents as linear sequ...
With the growing significance of digital libraries and the Internet, more and more electronic texts become accessible to a wide and geographically disperse public. This requires a...
Ulrich Schiel, Ianna M. S. F. de Sousa, Edberto Fe...
We present in this paper a system for converting PDF legacy documents into structured XML format. This conversion system first extracts the different streams contained in PDF files...
To summarize is to reducein complexity, and hencein length, while retaining some of the essential qualities of the original. This paper focusses on document extracts, a particular...