Search Sciweavers | Sciweavers

173 search results - page 11 / 35

» Using SOFM to Improve Web Site Text Content


WWW
2005
ACM

Category: Internet Technology

Extracting context to improve accuracy for HTML content extraction

16 years 8 months ago


Web pages contain clutter (such as ads, unnecessary images and extraneous links) around the body of an article, which distracts a user from actual content. Extraction of "use...

Suhit Gupta, Gail E. Kaiser, Salvatore J. Stolfo


ECWEB
2004
Springer

Category: ECommerce

Accelerating Database Processing at e-Commerce Sites

16 years 25 days ago


Most e-commerce Web sites dynamically generate their contents through a three-tier server architecture composed of a Web server, an application server, and a database ser...

Seunglak Choi, Jinwon Lee, Su Myeon Kim, Junehwa S...


WSDM
2010
ACM

Category: Data Mining

Boilerplate Detection using Shallow Text Features

16 years 4 months ago


In addition to the actual content Web pages consist of navigational elements, templates, and advertisements. This boilerplate text typically is not related to the main content, ma...

Christian Kohlschütter, Peter Fankhauser, Wol...


ICDE
2005
IEEE

Category: Database

A Comparative Evaluation of Transparent Scaling Techniques for Dynamic Content Servers

16 years 8 months ago


We study several transparent techniques for scaling dynamic content web sites, and we evaluate their relative impact when used in combination. Full transparency implies strong dat...

Cristiana Amza, Alan L. Cox, Willy Zwaenepoel


LREC
2008

Category: Education

Automatic Extraction of Textual Elements from News Web Pages

15 years 8 months ago


In this paper we present an algorithm for automatic extraction of textual elements, namely titles and full text, associated with news stories in news web pages. We propose a super...

Hossam Ibrahim, Kareem Darwish, Abdel-Rahim Madany
