Sciweavers

708 search results - page 56 / 142
» Identifying Content Blocks from Web Documents
WWW
2005
ACM
14 years 9 months ago
Extracting context to improve accuracy for HTML content extraction
Web pages contain clutter (such as ads, unnecessary images and extraneous links) around the body of an article, which distracts a user from actual content. Extraction of "use...
Suhit Gupta, Gail E. Kaiser, Salvatore J. Stolfo
WWW
2004
ACM
14 years 9 months ago
OntoMiner: bootstrapping ontologies from overlapping domain specific web sites
In this paper, we present automated techniques for bootstrapping and populating specialized domain ontologies by organizing and mining a set of relevant overlapping Web sites prov...
Hasan Davulcu, Srinivas Vadrevu, Saravanakumar Nag...
HT
2005
ACM
14 years 2 months ago
From the writable web to global editability
The technical and competence requirements for writing content on the web are still one of the major factors that widen the gap between authors and readers. Although tools that sup...
Angelo Di Iorio, Fabio Vitali
PDPTA
2003
13 years 10 months ago
Tuxedo: A Peer-to-Peer Caching System
We are witnessing two trends in Web content access: (a) increasing amounts of dynamic and personalized Web content, and (b) a significant growth in “on-the-move” access using...
Weisong Shi, Kandarp Shah, Yonggen Mao, Vipin Chau...
LWA
2008
13 years 10 months ago
Labeling Clusters - Tagging Resources
In order to support the navigation in huge document collections efficiently, tagged hierarchical structures can be used. Often, multiple tags are used to describe resources. For u...
Korinna Bade, Andreas Nürnberger