Search Sciweavers | Sciweavers

WWW
2009
ACM

Estimating web site readability using content extraction

Nowadays, information is primarily searched on the WWW. From a user perspective, the readability is an important criterion for measuring the accessibility and thereby the quality ...

Thomas Gottron, Ludger Martin

WWW
2003
ACM

DOM-based content extraction of HTML documents

Web pages often contain clutter (such as pop-up ads, unnecessary images and extraneous links) around the body of an article that distracts a user from actual content. Extraction o...

Suhit Gupta, Gail E. Kaiser, David Neistadt, Peter...

WWW
2005
ACM

Extracting semantic structure of web documents using content and visual information

This work aims to provide a page segmentation algorithm which uses both visual and content information to extract the semantic structure of a web page. The visual information is u...

Rupesh R. Mehta, Pabitra Mitra, Harish Karnick

WWW
2010
ACM

Exploiting content redundancy for web information extraction

We propose a novel extraction approach that exploits content redundancy on the web to extract structured data from template-based web sites. We start by populating a seed database...

Pankaj Gulhane, Rajeev Rastogi, Srinivasan H. Seng...

CIKM
2009
ACM

OfCourse: web content discovery, classification and information extraction for online course materials

Yuhong Xiong, Ping Luo, Yong Zhao, Fen Lin, Shicong Feng, Baoyao Zhou, Liw...
