Sciweavers

10 search results - page 2 / 2
» Readability of scanned books in digital libraries
CIKM
2008
Springer
13 years 9 months ago
Automatic metadata generation for scanned scientific volumes
Large scale digitization projects have been conducted at the Internet Archive digital library to preserve cultural artifacts and to provide permanent access. The increasing amount...
Xiaonan Lu, Brewster Kahle
DIAL
2004
IEEE
13 years 11 months ago
A General System for the Retrieval of Document Images from Digital Libraries
Large collections of scanned documents (books and journals) are now available in Digital Libraries. The most common method for retrieving relevant information from these collectio...
Simone Marinai, Emanuele Marino, Francesca Cesarin...
BMCBI
2011
12 years 11 months ago
Extracting scientific articles from a large digital archive: BioStor and the Biodiversity Heritage Library
Background: The Biodiversity Heritage Library (BHL) is a large digital archive of legacy biological literature, comprising over 31 million pages scanned from books, monographs, an...
Roderic D. M. Page
CIKM
2011
Springer
12 years 7 months ago
Partial duplicate detection for large book collections
A framework is presented for discovering partial duplicates in large collections of scanned books with optical character recognition (OCR) errors. Each book in the collection is r...
Ismet Zeki Yalniz, Ethem F. Can, R. Manmatha
JIIS
2002
13 years 7 months ago
Hidden Markov Models for Text Categorization in Multi-Page Documents
In the traditional setting, text categorization is formulated as a concept learning problem where each instance is a single isolated document. However, this perspective is not appr...
Paolo Frasconi, Giovanni Soda, Alessandro Vullo