Sciweavers

DAS
2010
Springer
14 years 2 months ago
Analysis and taxonomy of column header categories for web tables
We describe a component of a document analysis system for constructing ontologies for domain-specific web tables imported into Excel. This component automates extraction of the Wa...
Sharad C. Seth, Ramana Chakradhar Jandhyala, Mukka...
DAS
2010
Springer
14 years 2 months ago
Improved classification through runoff elections
We consider the problem of dealing with irrelevant votes when a multi-class classifier is built from an ensemble of binary classifiers. We show how run-off elections can be used to...
Oleg Golubitsky, Stephen M. Watt
DAS
2010
Springer
14 years 3 months ago
Text extraction from graphical document images using sparse representation
A novel text extraction method from graphical document images is presented in this paper. Graphical document images containing text and graphics components are considered as two-d...
Thai V. Hoang, Salvatore Tabbone
DAS
2010
Springer
14 years 3 months ago
A post-processing scheme for Malayalam using statistical sub-character language models
Most of the Indian scripts do not have any robust commercial OCRs. Many of the laboratory prototypes report reasonable results at recognition/classification stage. However, word ...
Karthika Mohan, C. V. Jawahar
DAS
2010
Springer
14 years 3 months ago
A kernel-based approach to document retrieval
In this paper we tackle the problem of document image retrieval by combining a similarity measure between documents and the probability that a given document belongs to a certain ...
Albert Gordo, Jaume Gibert, Ernest Valveny, Març...
DAS
2010
Springer
14 years 3 months ago
Towards more effective distance functions for word image matching
Matching word images has many applications in document recognition and retrieval systems. Dynamic Time Warping (DTW) is popularly used to estimate the similarity between word imag...
Raman Jain, C. V. Jawahar
DAS
2010
Springer
14 years 3 months ago
Nearest neighbor based collection OCR
Conventional optical character recognition (OCR) systems operate on individual characters and words, and do not normally exploit document or collection context. We describe a Coll...
K. Pramod Sankar, C. V. Jawahar, Raghavan Manmatha
DAS
2010
Springer
14 years 3 months ago
Memory-based recognition of camera-captured characters
This paper addresses how to quickly recognize a character pattern using a lot of case examples without learning. Here, "without learning" means just finding the most similar example...
Masakazu Iwamura, Tomohiko Tsuji, Koichi Kise
DAS
2010
Springer
14 years 4 months ago
Analysis of whole-book recognition
Whole-book recognition is a document image analysis strategy that operates on the complete set of a book’s page images, attempting to improve accuracy by automatic unsupervised ...
Pingping Xiu, Henry S. Baird
DAS
2010
Springer
14 years 4 months ago
IAMonDo-database: an online handwritten document database with non-uniform contents
In this paper we present a new database of online handwritten documents with different contents such as text, drawings, diagrams, formulas, tables, lists, and markings. It was de...
Emanuel Indermühle, Marcus Liwicki, Horst Bun...