A spanned cell in a table is a single, complete unit that physically occupies multiple columns and/or multiple rows. Spanned cells are common in tables, and they are a significan...
With the aim of extracting structural information from the table of contents (TOC) to help develop a digital document library, the requirement of identifying/segmenting the TOC page ...
S. Mandal, S. P. Chowdhury, Amit Kumar Das, Bhabat...
The first steps towards bridging the paper-digital divide have been achieved with the development of a range of technologies that allow printed documents to be linked to digital c...
When reading a document, we intuitively take a first global approach to determine the whole structure before reading parts in detail. We propose to apply the same kind ...
Recognition and encoding of digitized historical documents is still a challenging and difficult task. A major problem is the occurrence of unknown glyphs and symbols which might n...