Search Sciweavers | Sciweavers

2677 search results - page 22 / 536

» Extracting Structured Data from Web Pages



EMNLP
2008

139 views, Natural Language Processing

Improved Sentence Alignment on Parallel Web Pages Using a Stochastic Tree Alignment Model

15 years 8 months ago

Download www.aclweb.org

Parallel web pages are an important source of training data for statistical machine translation. In this paper, we present a new approach to sentence alignment on parallel web pages....

Lei Shi, Ming Zhou




CAISE
2003
Springer

120 views, Information Technology

Extending an on-line information site with accurate domain-dependent extracts from the World Wide Web

16 years 1 day ago

Download sunsite.informatik.rwth-aachen.de

This paper describes a new procedure that has been developed for extending an existing on-line information system about The Voyages of the Beagle with information collected automat...

Enrique Alfonseca, Pilar Rodríguez




SIGMOD
2010
ACM

232 views, Database

Optimizing content freshness of relations extracted from the web using keyword search

15 years 7 months ago

Download www2.hawaii.edu

An increasing number of applications operate on data obtained from the Web. These applications typically maintain local copies of the web data to avoid network latency in data acc...

Mohan Yang, Haixun Wang, Lipyeow Lim, Min Wang




VLDB
2011
ACM

251 views, Database

Harvesting relational tables from lists on the web

15 years 1 month ago

Download www.vldb.org

A large number of web pages contain data structured in the form of “lists”. Many such lists can be further split into multi-column tables, which can then be used in more seman...

Hazem Elmeleegy, Jayant Madhavan, Alon Y. Halevy




CIKM
1998
Springer

120 views, Information Technology

Ontology-Based Extraction and Structuring of Information from Data-Rich Unstructured Documents

15 years 11 months ago

Download pages.cs.wisc.edu

We present a new approach to extracting information from unstructured documents based on an application ontology that describes a domain of interest. Starting with such an ontolog...

David W. Embley, Douglas M. Campbell, Randy D. Smi...

