In this paper we analyze our recent research on the use of document analysis techniques for metadata extraction from PDF papers. We describe a package that is designed to extract ...
This paper reports some experiments in using SVG (Scalable Vector Graphics), rather than the browser default of (X)HTML/CSS, as a potential Web-based rendering technology, in an a...
Scanned document images are nowadays becoming available at increasingly high resolutions. Meanwhile, the variations in image quality within typical document collections increase...
Iuliu Konya, Christoph Seibert, Stefan Eicke...
This paper introduces a method for automatically partitioning richly-formatted electronic documents. An automatic partitioning system has many potential uses, but we focus here on ...
Random Indexing K-tree is the combination of two algorithms suited for large scale document clustering. Keywords: Random Indexing, K-tree, Dimensionality Reduction, B-tree, Search T...
Christopher M. De Vries, Lance De Vine, Shlomo Gev...