Large quantities of documents on the Internet and in digital libraries are simply scanned and archived in image format, and many of them are packed in PDF files. The word search tool pr...
The PDF format is commonly used for the exchange of documents on the Web and there is a growing need to understand and extract or repurpose data held in PDF documents. Many system...
As Adobe's Portable Document Format has exploded in popularity, so too has the number of PDF generators, and predictably the quality of generated PDF varies considerably. This pa...
PDF has become a very common format for exchanging printable documents. Further, it can be easily generated from the major document formats, which makes a huge number of PDF documents...
We present a novel approach to retrieve metadata for scholarly papers stored locally as PDF files. A fingerprint is produced from the PDF full text to query an online metadata repo...