Search Sciweavers | Sciweavers

» Extracting Partial Structures from HTML Documents

WWW
2006
ACM

Robust web content extraction

Download www2006.org

We present an empirical evaluation and comparison of two content extraction methods in HTML: absolute XPath expressions and relative XPath expressions. We argue that the relative ...

Marek Kowalkiewicz, Maria E. Orlowska, Tomasz Kacz...

COMAD
2009

Querying for relations from the semi-structured Web

Download www.cse.iitb.ac.in

We present a class of web queries whose result is a multi-column relation instead of a collection of unstructured documents as in standard web search. The user specifies the query...

Sunita Sarawagi

JCDL
2006
ACM

Automatic extraction of table metadata from digital documents

Download www.personal.psu.edu

Tables are used to present, list, summarize, and structure important data in documents. In scholarly articles, they are often used to present the relationships among data and high...

Ying Liu, Prasenjit Mitra, C. Lee Giles, Kun Bai

ICDAR
2003
IEEE

A Constraint-based Approach to Table Structure Derivation

Download www.cse.salford.ac.uk

This paper presents an approach to deriving an abstract geometric model of a table from a physical representation. The technique developed uses a graph of constraints between cells which ...

Matthew Hurst

KDD
2002
ACM

Discovering informative content blocks from Web documents

Download www.cs.ualberta.ca

In this paper, we propose a new approach to discover informative contents from a set of tabular documents (or Web pages) of a Web site. Our system, InfoDiscoverer, first partition...

Shian-Hua Lin, Jan-Ming Ho
