
ICDAR
2009
IEEE

A Realistic Dataset for Performance Evaluation of Document Layout Analysis

There is a significant need for a realistic dataset on which to evaluate layout analysis methods and examine their performance in detail. This paper presents a new dataset (and the methodology used to create it) based on a wide range of contemporary documents. Strong emphasis is placed on comprehensive and detailed representation of both complex and simple layouts, and on colour originals. In-depth information is recorded both at the page and region level. Ground truth is efficiently created using a new semi-automated tool and stored in a new comprehensive XML representation, the PAGE format. The dataset can be browsed and searched via a web-based front end to the underlying database and suitable subsets (relevant to specific evaluation goals) can be selected and downloaded.
Added 21 May 2010
Updated 21 May 2010
Type Conference
Year 2009
Where ICDAR
Authors Apostolos Antonacopoulos, David Bridson, Christos Papadopoulos, Stefan Pletschacher