This paper reports a document retrieval technique that retrieves machine-printed Latin-based document images through word shape coding. Adopting the idea of image annotation, a wo...
Abstract. XML documents are increasingly being used to mark up various kinds of data from web content to scientific data. Often these documents need to be collaboratively created a...
Background: The BioCreative text mining evaluation investigated the application of text mining methods to the task of automatically extracting information from text in biomedical ...
In this paper, we present a novel graph-based method for extracting handwritten text lines in monochromatic Arabic document images. Our approach consists of two steps Coarse text ...
Jayant Kumar, Wael Abd-Almageed, Le Kang, David S....
Abstract-- Text categorization is the task of assigning predefined categories to natural language text. With the widely used `bag of words' representation, previous researches...