Search Sciweavers | Sciweavers

» Extracting Structured Data from Web Pages

CIKM
2008
Springer

CoreEx: content extraction from online news articles

Download ilpubs.stanford.edu

We developed and tested a heuristic technique for extracting the main article from news site Web pages. We construct the DOM tree of the page and score every node based on the amo...

Jyotika Prasad, Andreas Paepcke

LPNMR
2001
Springer

Declarative Information Extraction, Web Crawling, and Recursive Wrapping with Lixto

Download lsirpeople.epfl.ch

Lixto is a system and method for the visual and interactive generation of wrappers for Web pages under the supervision of a human developer, for automatically extracting informatio...

Robert Baumgartner, Sergio Flesca, Georg Gottlob

PKDD
2004
Springer

Breaking Through the Syntax Barrier: Searching with Entities and Relations

Download www.cse.iitb.ac.in

The next wave in search technology will be driven by the identification, extraction, and exploitation of real-world entities represented in unstructured textual sources. Search sy...

Soumen Chakrabarti

WWW
2004
ACM

Learning block importance models for web pages

Download research.microsoft.com

Some previous works show that a web page can be partitioned to multiple segments or blocks, and usually the importance of those blocks in a page is not equivalent. Also, it is pro...

Ruihua Song, Haifeng Liu, Ji-Rong Wen, Wei-Ying Ma

JUCS
2008

Recognising Informative Web Page Blocks Using Visual Segmentation for Efficient Information Extraction

Download www.jucs.org

Abstract: As web sites are getting more complicated, the construction of web information extraction systems becomes more troublesome and time-consuming. A common theme is the diffi...

Jinbeom Kang, Joongmin Choi
