Automatic text tagging is an important component in higher level analysis of text corpora, and its output can be used in many natural language processing applications. In language...
In this paper, we address a unique problem in Chinese language processing and report on our study on extending a Chinese thesaurus with region-specific words, mostly from the fina...
A comprehensive online unconstrained Chinese handwriting dataset, SCUT-COUCH2009, is introduced in this paper. As a revision of SCUT-COUCH2008 [1], the SCUT-COUCH2009 database co...
Lianwen Jin, Yan Gao, Gang Liu, Yunyang Li, Kai Di...
In this work we obtain robust category-based language models to be integrated into speech recognition systems. Deductive rules are used to select linguistic categories and to matc...
In this paper, we introduce a generative probabilistic optical character recognition (OCR) model that describes an end-to-end process in the noisy channel framework, progressing f...