Word Segmentation of Handwritten Dates in Historical Documents by Combining Semantic A-Priori-Knowledge with Local Features

The recognition of script in historical documents requires suitable techniques for identifying single words. Segmentation of lines and words is a challenging task because lines are not straight and words may intersect within and between lines. For correct word segmentation, the conventional analysis of distances between text objects needs to be supplemented by a second component that predicts possible word boundaries based on semantic information. For date entries, hypotheses about potential boundaries are generated from knowledge about the different ways in which dates are written in the documents. This knowledge is modeled by distribution curves for potential boundary locations. Word boundaries are detected by classification of local features, such as distances between adjacent text objects, together with location-based boundary distribution curves as a-priori knowledge. We applied the technique to date entries in historical church registers. Documents from the 18th and 19th cen...
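
The idea of combining local gap distances with location-based boundary distribution curves can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the Gaussian shape of the prior curves, the logistic gap score, the product rule with a fixed threshold (the abstract only says that a classification is used), and all names such as `detect_boundaries` and `expected_positions`.

```python
import math
from statistics import mean


def boundary_prior(pos, expected_positions, sigma=0.08):
    """Hypothetical a-priori curve: Gaussian bumps centred on the relative
    line positions (0..1) where a date format suggests word boundaries."""
    return max(math.exp(-0.5 * ((pos - m) / sigma) ** 2)
               for m in expected_positions)


def gap_score(gap, mean_gap):
    """Local feature: logistic score in (0, 1) that grows with the gap
    width relative to the mean gap on the line."""
    return 1.0 / (1.0 + math.exp(-(gap - mean_gap) / (0.5 * mean_gap)))


def detect_boundaries(objects, expected_positions=None, threshold=0.5):
    """objects: list of (x_left, x_right) text objects sorted left to right.
    Returns the indices i for which a word boundary is assumed between
    object i and object i + 1. With expected_positions=None only the
    distance feature is used (no a-priori knowledge)."""
    if len(objects) < 2:
        return []
    line_start = objects[0][0]
    line_length = objects[-1][1] - line_start
    gaps = [objects[i + 1][0] - objects[i][1] for i in range(len(objects) - 1)]
    mean_gap = mean(gaps)  # assumed to be positive
    boundaries = []
    for i, gap in enumerate(gaps):
        centre = (objects[i][1] + objects[i + 1][0]) / 2
        pos = (centre - line_start) / line_length
        prior = 1.0 if expected_positions is None else boundary_prior(
            pos, expected_positions)
        # Assumed combination rule: product of local score and prior.
        if gap_score(gap, mean_gap) * prior >= threshold:
            boundaries.append(i)
    return boundaries


# Made-up example: seven text objects on one line, with a wide gap inside
# the last word (for instance a year written with a break in the middle).
objects = [(0, 10), (12, 20), (30, 38), (40, 50), (62, 70), (73, 80), (90, 100)]

print(detect_boundaries(objects))                # [1, 3, 5]
print(detect_boundaries(objects, [0.25, 0.55]))  # [1, 3]
```

In this made-up example the distance feature alone splits the last word at its wide internal gap, while the prior curves, which expect boundaries only near 25% and 55% of the line, suppress that split.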

Markus Feldbach, Klaus D. Tönnies

Date Entries | Document Analysis | ICDAR 2003 | Possible Word Boundaries | Word Boundaries |


Added	04 Jul 2010
Updated	04 Jul 2010
Type	Conference
Year	2003
Where	ICDAR
Authors	Markus Feldbach, Klaus D. Tönnies
