We demonstrate that transformation-based learning can be used to correct noisy speech recognition transcripts in the lecture domain with an average word error rate reduction of 12...
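For readers unfamiliar with the metric named in this abstract, here is a minimal sketch of how word error rate (WER) is conventionally computed: the word-level edit distance between a reference and a hypothesis transcript, divided by the reference length. The function names are hypothetical. The snippet does not say whether the reported reduction is relative or absolute, so the relative form shown here is an assumption. This illustrates the metric only, not the transformation-based learning method.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(wer_before: float, wer_after: float) -> float:
    """Assumed definition: fraction of the original WER that was removed."""
    return (wer_before - wer_after) / wer_before
```

A transcript-correction system would be scored by computing `word_error_rate` before and after correction and passing both values to `relative_wer_reduction`.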
One of the problems in part-of-speech tagging of real-world texts is that of words unknown to the lexicon. In (Mikheev, 1996), a technique for fully unsupervised statistical acquis...
This paper describes an italic font recognition method using stroke pattern analysis on wavelet decomposed word images. The word images are extracted from scanned text documents c...
In word sense disambiguation (WSD), the heuristic of choosing the most common sense is extremely powerful because the distribution of the senses of a word is often skewed. The pro...
Diana McCarthy, Rob Koeling, Julie Weeds, John A. ...
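The most-common-sense heuristic mentioned in this abstract can be sketched as a simple baseline: count how often each sense of a word occurs, then always predict the most frequent one. The sketch below assumes a small sense-tagged corpus is available. The function names and sense labels are hypothetical, and this is the generic baseline, not the method the paper itself proposes.

```python
from collections import Counter, defaultdict


def train_most_frequent_sense(tagged_tokens):
    """Map each word to the sense it carries most often in the tagged data."""
    counts = defaultdict(Counter)
    for word, sense in tagged_tokens:
        counts[word][sense] += 1
    return {word: senses.most_common(1)[0][0] for word, senses in counts.items()}


def disambiguate(word, most_frequent_sense, default=None):
    """Ignore context entirely and return the stored predominant sense."""
    return most_frequent_sense.get(word, default)


# Hypothetical toy data: (word, sense label) pairs.
toy_corpus = [
    ("bank", "bank/finance"),
    ("bank", "bank/finance"),
    ("bank", "bank/river"),
    ("star", "star/celestial"),
]
model = train_most_frequent_sense(toy_corpus)
```

Because sense distributions are often skewed, this context-free lookup is right whenever the word is used in its dominant sense, which is why it makes a strong baseline.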
Abstract: According to the characteristics of Mongolian word formation, a method for removing inflectional suffixes from word images of the Mongolian Kanjur is proposed in this paper. ...