ICDAR
2011
IEEE

Aletheia - An Advanced Document Layout and Text Ground-Truthing System for Production Environments

Large-scale digitisation has led to a number of new possibilities with regard to adaptive and learning-based methods in the field of Document Image Analysis and OCR. For ground truth production of large corpora, however, there is still a gap in terms of productivity. Ground truth is crucial not only for training and evaluation at the development stage of tools but also for quality assurance within production workflows for digital libraries. This paper describes Aletheia, an advanced system for accurate yet cost-effective ground truthing of large numbers of documents. It aids the user with a number of automated and semi-automated tools, which were partly developed and improved based on feedback from major libraries across Europe and from their digitisation service providers, who are using the tool in a production environment. Novel features are, among others, the support of top-down ground truthing with sophisticated split and shrink tools as well as bottom-up ground trut...
Added 24 Dec 2011
Updated 24 Dec 2011
Type Journal
Year 2011
Where ICDAR
Authors C. Clausner, Stefan Pletschacher, Apostolos Antonacopoulos