Example-Based Logical Labeling of Document Title Page Images

This paper presents a flexible and effective example-based approach for labeling title pages, which can be used for automated extraction of bibliographic data. The labels of interest are “Title”, “Author”, “Abstract”, and “Affiliation”. The method takes a set of labeled document layouts and a single unlabeled document layout as input and finds the best-matching layout in the set. The labels of that layout are then used to label the new layout. The similarity measure for layouts combines structural layout similarity and textural similarity at the block level. Experiments on the publicly available MARG dataset yield accuracy rates from 94.8% to 99.6%. This shows that our lightweight method performs as well as, and in some cases better than, more complex labeling methods known from the literature.
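The matching-and-transfer idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: the `Block` type, the function names, and the sample layouts are hypothetical, and the similarity used here is plain rectangle overlap (intersection over union), standing in for the combined structural and textural measure, which the abstract does not specify in detail.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Block:
    """A layout block as a rectangle in page-normalized coordinates (0..1)."""
    x0: float
    y0: float
    x1: float
    y1: float
    label: Optional[str] = None  # e.g. "Title", "Author", "Abstract", "Affiliation"


def iou(a: Block, b: Block) -> float:
    """Intersection over union of two rectangles (0.0 when they do not overlap)."""
    iw = min(a.x1, b.x1) - max(a.x0, b.x0)
    ih = min(a.y1, b.y1) - max(a.y0, b.y0)
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a.x1 - a.x0) * (a.y1 - a.y0)
    area_b = (b.x1 - b.x0) * (b.y1 - b.y0)
    return inter / (area_a + area_b - inter)


def layout_similarity(query: List[Block], reference: List[Block]) -> float:
    """Assumed stand-in measure: mean of each query block's best overlap."""
    if not query or not reference:
        return 0.0
    return sum(max(iou(q, r) for r in reference) for q in query) / len(query)


def label_by_example(query: List[Block], labeled_layouts: List[List[Block]]) -> List[Block]:
    """Pick the most similar labeled layout and copy its labels to the query."""
    best = max(labeled_layouts, key=lambda ref: layout_similarity(query, ref))
    result = []
    for q in query:
        match = max(best, key=lambda r: iou(q, r))
        # Leave the block unlabeled if nothing in the best layout overlaps it.
        label = match.label if iou(q, match) > 0 else None
        result.append(Block(q.x0, q.y0, q.x1, q.y1, label))
    return result


# Two hypothetical labeled layouts and one unlabeled query layout.
ref_a = [
    Block(0.1, 0.05, 0.9, 0.15, "Title"),
    Block(0.2, 0.18, 0.8, 0.25, "Author"),
    Block(0.1, 0.35, 0.9, 0.70, "Abstract"),
]
ref_b = [
    Block(0.1, 0.40, 0.9, 0.50, "Title"),
    Block(0.1, 0.55, 0.9, 0.60, "Author"),
]
query = [Block(0.12, 0.06, 0.88, 0.16), Block(0.22, 0.18, 0.78, 0.24)]

labeled = label_by_example(query, [ref_b, ref_a])
print([b.label for b in labeled])  # prints ['Title', 'Author']
```

Here the query overlaps only the blocks of `ref_a`, so that layout is selected and its labels are transferred. A more faithful version would replace `layout_similarity` with a measure that also compares block content.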

Joost van Beusekom, Daniel Keysers, Faisal Shafait, Thomas M. Breuel

Document Analysis | Document Layout | ICDAR 2007 | Structural Layout Similarity | Unlabeled Document Layout

Post Info

Added	19 Oct 2010
Updated	19 Oct 2010
Type	Conference
Year	2007
Where	ICDAR
Authors	Joost van Beusekom, Daniel Keysers, Faisal Shafait, Thomas M. Breuel
