In classification tasks, the class-modular strategy has been widely used. It has outperformed the classical strategy for pattern classification tasks in many applications [1]. However, in ...
Double-sided manuscripts are often degraded by bleedthrough interference. Such degradation must be corrected to facilitate human perception and machine recognition. Most approache...
Automatic Term Recognition (ATR) is concerned with discovering terminology in large volumes of text corpora. Technical terms are vital elements for understanding the techniques us...
In this paper we present an adaptive method for graphic symbol representation based on shape contexts. The proposed descriptor is invariant under classical geometric transforms (r...
We discuss problems in developing policies for ground truthing document images for pixel-accurate segmentation. First, we describe ground truthing policies that apply to four diff...
Named Entity Recognition (NER) is an important subtask of document processing tasks such as Information Extraction. This paper describes an NER algorithm which uses a Multi-Layer Percept...
In this paper we explore the effectiveness of three clustering methods used to perform word image indexing. The three methods are: the Self-Organizing Map (SOM), the Growing Hiera...
As a result of well-publicized security concerns with direct recording electronic (DRE) voting, there is a growing call for systems that employ some form of paper artifact to prov...
Daniel P. Lopresti, George Nagy, Elisa H. Barney S...
Certain forms of mathematical expression are used more often than others in practice. A quantitative understanding of actual usage can provide additional information to improve th...