Sciweavers

55 search results - page 6 / 11
» Web page sectioning using regex-based template
WWW
2004
ACM
14 years 8 months ago
Sic transit gloria telae: towards an understanding of the web's decay
The rapid growth of the web has been noted and tracked extensively. Recent studies have, however, documented the dual phenomenon: web pages have small half lives, and thus the web e...
Ziv Bar-Yossef, Andrei Z. Broder, Ravi Kumar, Andr...
DSN
2009
IEEE
13 years 11 months ago
Efficient resource management on template-based web servers
The most commonly used request processing model in multithreaded web servers is thread-per-request, in which an individual thread is bound to serve each web request. However, with...
Eli Courtwright, Chuan Yue, Haining Wang
WSDM
2010
ACM
14 years 4 months ago
Boilerplate Detection using Shallow Text Features
In addition to the actual content, Web pages consist of navigational elements, templates, and advertisements. This boilerplate text typically is not related to the main content, ma...
Christian Kohlschütter, Peter Fankhauser, Wol...
ICWE
2004
Springer
14 years 26 days ago
Accelerating Dynamic Web Content Delivery Using Keyword-Based Fragment Detection
The recent trend in Internet traffic is an increase in requests for dynamic and personalized content. To efficiently serve this trend, several server-side and cache-side fragme...
Daniel Brodie, Amrish Gupta, Weisong Shi
WWW
2009
ACM
14 years 2 months ago
News article extraction with template-independent wrapper
We consider the problem of template-independent news extraction. The state-of-the-art news extraction method is based on template-level wrapper induction, which has two serious li...
Junfeng Wang, Xiaofei He, Can Wang, Jian Pei, Jiaj...