Abstract. In this paper, we present an approach for classifying documents based on the notion of a semantic similarity and the effective representation of the content of the docume...
The proliferation of digital libraries and the large amount of existing documents raise important issues in efficient handling of documents. Printed texts in documents need to be...
This paper presents an automatic orientation detection and categorization technique that is capable of detecting the orientation of multilingual documents with arbitrary skew and ...
We present sppc, a high-performance system for intelligent text extraction and navigation from German free text documents. sppc consists of a set of domainindependent shallow core...
We present a document analysis system able to assign logical labels and extract the reading order in a broad set of documents. All information sources, from geometric features and ...