Sciweavers

502 search results - page 24 / 101
» Extracting Partial Structures from HTML Documents
ICDAR
2009
IEEE
13 years 6 months ago
Extraction of Nom Text Regions from Stele Images Using Area Voronoi Diagram
Automatic processing of images of steles is a challenging problem due to the variation in their structures and body text characteristics. In this paper, area Voronoi diagram is us...
Thai V. Hoang, Salvatore Tabbone, Ngoc-Yen Pham
ESWS
2004
Springer
14 years 1 month ago
Learning to Harvest Information for the Semantic Web
In this paper we describe a methodology for harvesting information from large distributed repositories (e.g. large Web sites) with minimum user intervention. The methodol...
Fabio Ciravegna, Sam Chapman, Alexiei Dingli, Yori...
IADIS
2004
13 years 9 months ago
A conceptual modeling of multimedia documents
Our research focuses on the identification and representation of the semantic structures of multimedia documents. The semantic structure of a multimedia document ...
Mohamed Mbarki, Chantal Soulé-Dupuy
LWA
2008
13 years 10 months ago
Rule-Based Information Extraction for Structured Data Acquisition using TextMarker
Information extraction is concerned with the location of specific items in (unstructured) textual documents, e.g., being applied for the acquisition of structured data. Then, the ...
Martin Atzmüller, Peter Klügl, Frank Pup...
WWW
2010
ACM
14 years 3 months ago
CETR: content extraction via tag ratios
We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
Tim Weninger, William H. Hsu, Jiawei Han