Sciweavers

72 search results - page 3 / 15
» Automatic Selection of Table Areas in Documents for Informat...
IJMSO
2008
13 years 7 months ago
Categorisation of web documents using extraction ontologies
Automatically recognising which HTML documents on the Web contain items of interest for a user is non-trivial. As a step toward solving this problem, we propose an approach based...
Li Xu, David W. Embley
WWW
2010
ACM
14 years 2 months ago
Entity relation discovery from web tables and links
The World-Wide Web consists not only of a huge number of unstructured texts, but also a vast amount of valuable structured data. Web tables [2] are a typical type of structured in...
Cindy Xide Lin, Bo Zhao, Tim Weninger, Jiawei Han,...
DAS
2010
Springer
13 years 6 months ago
Automatic unsupervised parameter selection for character segmentation
A major difficulty for designing a document image segmentation methodology is the proper value selection for all involved parameters. This is usually done after experimentations o...
Georgios Vamvakas, Nikolaos Stamatopoulos, Basilio...
ICDAR
2009
IEEE
14 years 2 months ago
ICDAR 2009 Book Structure Extraction Competition
This paper introduces the Book Structure Extraction competition run at ICDAR 2009. The goal of the competition is to evaluate and compare automatic techniques for deriving structu...
Antoine Doucet, Gabriella Kazai, Bodin Dresevic, A...
SOFSEM
2007
Springer
14 years 1 months ago
Creating Permanent Test Collections of Web Pages for Information Extraction Research
In the research area of automatic web information extraction, there is a need for permanent and annotated web page collections enabling objective performance evaluation of differen...
Bernhard Pollak, Wolfgang Gatterbauer