Search Sciweavers | Sciweavers

» Extracting Structured Data from Web Pages

WWW 2007, ACM (Internet Technology)

U-REST: an unsupervised record extraction system

Download people.csail.mit.edu

In this paper, we describe a system that can extract record structures from web pages with no direct human supervision. Records are commonly occurring HTML-embedded data tuples th...

Yuan Kui Shen, David R. Karger

ICTAI 2008, IEEE (Artificial Intelligence)

Adaptive Mobile Interfaces through Grammar Induction

Download www.utdallas.edu

This paper presents a grammar-induction based approach to partitioning a Web page into several small pages while each small page fits not only spatially but also logically for mob...

Jun Kong, Kevin L. Ates, Kang Zhang, Yan Gu

WWW 2005, ACM (Internet Technology)

Extracting semantic structure of web documents using content and visual information

Download www2005.org

This work aims to provide a page segmentation algorithm which uses both visual and content information to extract the semantic structure of a web page. The visual information is u...

Rupesh R. Mehta, Pabitra Mitra, Harish Karnick

KDD 2012, ACM (Data Mining)

Harnessing the wisdom of the crowds for accurate web page clipping

Download www.hpl.hp.com

Clipping Web pages, namely extracting the informative clips (areas) from Web pages, has many applications, such as Web printing and e-reading on small handheld devices. Although m...

Lei Zhang, Linpeng Tang, Ping Luo, Enhong Chen, Li...

SOFSEM 2007, Springer (Theoretical Computer Science)

Creating Permanent Test Collections of Web Pages for Information Extraction Research

Download www.dbai.tuwien.ac.at

In the research area of automatic web information extraction, there is a need for permanent and annotated web page collections enabling objective performance evaluation of differen...

Bernhard Pollak, Wolfgang Gatterbauer
