Sciweavers

ACL
1998

Japanese OCR Error Correction using Character Shape Similarity and Statistical Language Model

14 years 26 days ago
Japanese OCR Error Correction using Character Shape Similarity and Statistical Language Model
We present a novel OCR error correction method for languages without word delimiters that have a large character set, such as Japanese and Chinese. It consists of a statistical OCR model, an approximate word matching method using character shape similarity, and a word segmentation algorithm using a statistical language model. By using a statistical OCR model and character shape similarity, the proposed error corrector outperforms the previously published method. When the baseline character recognition accuracy is 90%, it achieves 97.4% character recognition accuracy.
Masaaki Nagata
Added 01 Nov 2010
Updated 01 Nov 2010
Type Conference
Year 1998
Where ACL
Authors Masaaki Nagata
Comments (0)