Sciweavers

ICDAR
2009
IEEE
14 years 2 months ago
Text Lines and Snippets Extraction for 19th Century Handwriting Documents Layout Analysis
In this paper we propose a new approach to improving electronic editions of human science corpora by providing an efficient estimation of manuscript page structure. In any handwriti...
Vincent Malleron, Véronique Eglin, Hubert E...
WWW
2005
ACM
14 years 8 months ago
Extracting context to improve accuracy for HTML content extraction
Web pages contain clutter (such as ads, unnecessary images and extraneous links) around the body of an article, which distracts a user from actual content. Extraction of "use...
Suhit Gupta, Gail E. Kaiser, Salvatore J. Stolfo
WWW
2009
ACM
14 years 8 months ago
Extracting article text from the web with maximum subsequence segmentation
Much of the information on the Web is found in articles from online news outlets, magazines, encyclopedias, review collections, and other sources. However, extracting this content...
Jeff Pasternack, Dan Roth
DRR
2009
13 years 5 months ago
Text-image alignment for historical handwritten documents
We describe our work on text-image alignment in the context of building a historical document retrieval system. We aim at aligning images of words in handwritten lines with their text...
Svitlana Zinger, John Nerbonne, Lambert Schomaker
ICDAR
2009
IEEE
14 years 2 months ago
Text Line Segmentation Based on Morphology and Histogram Projection
Text extraction is an important phase in document recognition systems. In order to segment text from a page document it is necessary to detect all the possible manuscript text reg...
Rodolfo P. dos Santos, Gabriela S. Clemente, Ing R...