Template-driven HTML documents possess an implicit, fixed schema denoting concepts and their relationships in a hierarchical fashion. Discovering this schema remains a relatively ...
Saikat Mukherjee, Guizhen Yang, Wenfang Tan, I. V....
Proper display and accurate recognition of document images are often hampered by degradations caused by poor scanning or transmission conditions. We propose a method to enhance su...
In this paper we present a system that allows users to build synthetic graphical documents for the performance evaluation of symbol recognition systems. The key contribution of ...
Mathieu Delalandre, Tony P. Pridmore, Ernest Valve...
In this paper, we present an approach for classifying documents based on the notion of semantic similarity and the effective representation of the content of the docume...
A new text line location and separation algorithm for complex handwritten documents is proposed. The algorithm is based on the application of a fuzzy directional runlength. The pr...