Sciweavers

DAS
2008
Springer
13 years 9 months ago
The Convergence of Iterated Classification
We report an improved methodology for training a sequence of classifiers for document image content extraction, that is, the location and segmentation of regions containing handwr...
Chang An, Henry S. Baird
DAS
2008
Springer
13 years 9 months ago
A Graphics Image Processing System
Patent document images maintained by the U.S. patent database have a specific format, in which figures and text descriptions are separated into different sections. This makes it d...
Linlin Li, Chew Lim Tan
DOCENG
2007
ACM
13 years 9 months ago
Editing with style
HTML has popularized the use of style sheets, and the advent of XML has stressed the importance of style as a key area complementing document structure and content. A number of to...
Vincent Quint, Irène Vatton
DAS
2010
Springer
13 years 9 months ago
Overlapped text segmentation using Markov random field and aggregation
Separating machine printed text and handwriting from overlapping text is a challenging problem in the document analysis field and no reliable algorithms have been developed thus f...
Xujun Peng, Srirangaraj Setlur, Venu Govindaraju, ...
DAS
2010
Springer
13 years 9 months ago
Investigator name recognition from medical journal articles: a comparative study of SVM and structural SVM
Automated extraction of bibliographic information from journal articles is key to the affordable creation and maintenance of citation databases, such as MEDLINE.
Xiaoli Zhang, Jie Zou, Daniel X. Le, George R. Tho...
DOCENG
2005
ACM
13 years 9 months ago
Managing syntactic variation in text retrieval
Information Retrieval systems are limited by the linguistic variation of language. The use of Natural Language Processing techniques to manage this problem has been studied for a ...
Jesús Vilares, Carlos Gómez-Rodrí...
DOCENG
2005
ACM
13 years 9 months ago
Generative semantic clustering in spatial hypertext
This paper presents an iterative method for generative semantic clustering of related information elements in spatial hypertext documents. The goal is to automatically organize th...
Andruid Kerne, Eunyee Koh, Vikram Sundaram, J. Mic...
DOCENG
2005
ACM
13 years 9 months ago
Enhancing composite digital documents using XML-based standoff markup
Document representations can rapidly become unwieldy if they try to encapsulate all possible document properties, ranging from abstract structure to detailed rendering and layout. We pres...
Peter L. Thomas, David F. Brailsford
DOCENG
2005
ACM
13 years 9 months ago
A web-based document harmonization and annotation chain: from PDF to RDF
Thierry Jacquin, Olivier Fambon, Boris Chidlovskii
DOCENG
2005
ACM
13 years 9 months ago
Prefiltering techniques for efficient XML document processing
Chia-Hsin Huang, Tyng-Ruey Chuang, Hahn-Ming Lee