Search Sciweavers | Sciweavers

» Testbed for information extraction from deep web

SIGMOD
2008
ACM

Database

Web-scale extraction of structured data

16 years 4 months ago

Download turing.cs.washington.edu

A long-standing goal of Web research has been to construct a unified Web knowledge base. Information extraction techniques have shown good results on Web inputs, but even most dom...

Michael J. Cafarella, Jayant Madhavan, Alon Y. Hal...

WWW
2005
ACM

Internet Technology

Extracting context to improve accuracy for HTML content extraction

16 years 5 months ago

Download www1.cs.columbia.edu

Web pages contain clutter (such as ads, unnecessary images and extraneous links) around the body of an article, which distracts a user from actual content. Extraction of "use...

Suhit Gupta, Gail E. Kaiser, Salvatore J. Stolfo

BIS
2006

Business

Expected Utility of Content Blocks in Web Content Extraction

15 years 5 months ago

Download integror.net

In this paper we discuss the possible application of new concepts in web content extraction: utility assessment, utility annealing, and dynamic aggregated document generation. Aft...

Marek Kowalkiewicz

WEBDB
2010
Springer

Database

Redundancy-Driven Web Data Extraction and Integration

15 years 9 months ago

Download www.dia.uniroma3.it

A large number of web sites publish pages containing structured information about recognizable concepts, but these data are only partially used by current applications. Although s...

Paolo Papotti, Valter Crescenzi, Paolo Merialdo, M...

ACL
2003

Computational Linguistics

Integrating Information Extraction and Automatic Hyperlinking

15 years 5 months ago

Download acl.ldc.upenn.edu

This paper presents a novel information system integrating advanced information extraction technology and automatic hyper-linking. Extracted entities are mapped into a domain onto...

Stephan Busemann, Witold Drozdzynski, Hans-Ulrich ...
