Search Sciweavers | Sciweavers

502 search results - page 24 / 101

» Extracting Partial Structures from HTML Documents


ICDAR 2009 (IEEE), Document Analysis, 154 views

Extraction of Nom Text Regions from Stele Images Using Area Voronoi Diagram


Download www.cvc.uab.es

Automatic processing of images of steles is a challenging problem due to the variation in their structures and body text characteristics. In this paper, area Voronoi diagram is us...

Thai V. Hoang, Salvatore Tabbone, Ngoc-Yen Pham


ESWS 2004 (Springer), Internet Technology, 122 views

Learning to Harvest Information for the Semantic Web


Download eprints.aktors.org

Abstract. In this paper we describe a methodology for harvesting information from large distributed repositories (e.g. large Web sites) with minimum user intervention. The methodol...

Fabio Ciravegna, Sam Chapman, Alexiei Dingli, Yori...


IADIS 2004, Internet Technology, 127 views

A conceptual modeling of multimedia documents


Download www.iadis.net

Our research addresses the identification and representation of the semantic structures of multimedia documents. The semantic structure of a multimedia document ...

Mohamed Mbarki, Chantal Soulé-Dupuy


LWA 2008, Software Engineering, 220 views

Rule-Based Information Extraction for Structured Data Acquisition using TextMarker


Download ki.informatik.uni-wuerzburg.de

Information extraction is concerned with the location of specific items in (unstructured) textual documents, e.g., being applied for the acquisition of structured data. Then, the ...

Martin Atzmüller, Peter Klügl, Frank Pup...


WWW 2010 (ACM), Internet Technology, 257 views

CETR: content extraction via tag ratios


Download www.cs.illinois.edu

We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...

Tim Weninger, William H. Hsu, Jiawei Han
