Sciweavers

DIAL
2004
IEEE

Document Style Census for OCR

Four methods of converting paper documents to computer-readable form are compared with regard to hypothetical labor cost: keyboarding, omnifont OCR, style-specific OCR, and style-constrained or style-adaptive OCR. The best choice is determined primarily by (1) the reject rates of the various OCR systems at a given error rate, (2) the fraction of the material that must be labeled for training the system, and (3) the cost of partitioning the material according to style. For large corpora, sampling strategies are proposed both for estimating conversion costs and for taking advantage of style homogeneity.
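To make the three cost drivers named in the abstract concrete, here is a minimal sketch of how such a comparison might be set up. It is not the authors' cost model: the formula, the names `Method`, `labor_cost`, and `cheapest`, and every numeric value are assumptions chosen only for illustration. The sketch assumes that labeled training material and rejected characters are both keyed by hand at the same unit cost, and that partitioning by style is a fixed cost.

```python
from dataclasses import dataclass


@dataclass
class Method:
    """Hypothetical description of one conversion method (illustrative only)."""
    name: str
    reject_rate: float       # fraction of OCR'd characters rejected and keyed by hand
    labeled_fraction: float  # fraction of the material labeled by hand for training
    partition_cost: float    # fixed cost of sorting the material by style


def labor_cost(m: Method, n_chars: int, key_cost: float = 1.0) -> float:
    """Assumed cost: hand-keyed characters times unit cost, plus partitioning."""
    # Labeled material is keyed entirely; the remainder is keyed only when rejected.
    manual = n_chars * (m.labeled_fraction + (1 - m.labeled_fraction) * m.reject_rate)
    return manual * key_cost + m.partition_cost


def cheapest(methods: list[Method], n_chars: int) -> Method:
    """Return the method with the lowest assumed labor cost for a corpus size."""
    return min(methods, key=lambda m: labor_cost(m, n_chars))


# Placeholder parameters, NOT taken from the paper.
METHODS = [
    Method("keyboarding", reject_rate=1.0, labeled_fraction=0.0, partition_cost=0.0),
    Method("omnifont OCR", reject_rate=0.20, labeled_fraction=0.0, partition_cost=0.0),
    Method("style-specific OCR", reject_rate=0.05, labeled_fraction=0.02, partition_cost=5000.0),
    Method("style-adaptive OCR", reject_rate=0.08, labeled_fraction=0.0, partition_cost=0.0),
]

if __name__ == "__main__":
    for size in (1_000, 1_000_000):
        best = cheapest(METHODS, size)
        print(size, best.name, round(labor_cost(best, size), 2))
```

Under these made-up parameters, the fixed partitioning cost weighs heavily on a small corpus and is amortized on a large one, which is the kind of trade-off the abstract describes; real values would have to come from measured reject rates and labor rates.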

George Nagy, Prateek Sarkar

DIAL 2004 | Hypothetical Labor Cost | Image Analysis | Omnifont OCR | Various OCR Systems

Related Content

» Towards a Ptolemaic Model for OCR

» Keyword Spotting in Document Images through Word Shape Coding

» Recognition of Printed Amharic Documents

» On Separation of English Numerals from Multilingual Document Images

» Indexing and retrieval of words in old documents

» Character Recognition by Adaptive Statistical Similarity

» BLSTM Neural Network Based Word Retrieval for Hindi Documents

Post Info

Added	20 Aug 2010
Updated	20 Aug 2010
Type	Conference
Year	2004
Where	DIAL
Authors	George Nagy, Prateek Sarkar