Abstract: Document analysis and text mining techniques are used to preprocess documents in information retrieval systems, to extract concepts in ontology construction processes, an...
—This paper presents a new method for localization of digit strings with a specific syntax in Farsi/ Arabic document images. First, some features are extracted from all connected...
Hidden Markov Models (HMM) are probabilistic graphical models for interdependent classification. In this paper we experiment with different ways of combining the components of an ...
Documents can be assigned keywords by frequency analysis of the terms found in the document text, which arguably is the primary source of knowledge about the document itself. By in...
Anette Hulth, Jussi Karlgren, Anna Jonsson, Henrik...
Neural networks are a powerful technology for classification of visual inputs arising from documents. However, there is a confusing plethora of different neural network methods th...