Sciweavers

173 search results - page 11 / 35
» Using SOFM to Improve Web Site Text Content
WWW
2005
ACM
14 years 8 months ago
Extracting context to improve accuracy for HTML content extraction
Web pages contain clutter (such as ads, unnecessary images and extraneous links) around the body of an article, which distracts a user from actual content. Extraction of "use...
Suhit Gupta, Gail E. Kaiser, Salvatore J. Stolfo
ECWEB
2004
Springer
225 views, ECommerce
14 years 28 days ago
Accelerating Database Processing at e-Commerce Sites
Most e-commerce Web sites dynamically generate their contents through a three-tier server architecture composed of a Web server, an application server, and a database ser...
Seunglak Choi, Jinwon Lee, Su Myeon Kim, Junehwa S...
WSDM
2010
ACM
215 views, Data Mining
14 years 5 months ago
Boilerplate Detection using Shallow Text Features
In addition to the actual content, Web pages consist of navigational elements, templates, and advertisements. This boilerplate text typically is not related to the main content, ma...
Christian Kohlschütter, Peter Fankhauser, Wol...
ICDE
2005
IEEE
260 views, Database
14 years 9 months ago
A Comparative Evaluation of Transparent Scaling Techniques for Dynamic Content Servers
We study several transparent techniques for scaling dynamic content web sites, and we evaluate their relative impact when used in combination. Full transparency implies strong dat...
Cristiana Amza, Alan L. Cox, Willy Zwaenepoel
LREC
2008
160 views, Education
13 years 9 months ago
Automatic Extraction of Textual Elements from News Web Pages
In this paper we present an algorithm for automatic extraction of textual elements, namely titles and full text, associated with news stories in news web pages. We propose a super...
Hossam Ibrahim, Kareem Darwish, Abdel-Rahim Madany