Sciweavers

92 search results - page 3 / 19
» HTML Pattern Generator--Automatic Data Extraction from Web P...
Sort
View
WWW
2003
ACM
14 years 8 months ago
DOM-based content extraction of HTML documents
Web pages often contain clutter (such as pop-up ads, unnecessary images and extraneous links) around the body of an article that distracts a user from actual content. Extraction o...
Suhit Gupta, Gail E. Kaiser, David Neistadt, Peter...
WWW
2005
ACM
14 years 8 months ago
Thresher: automating the unwrapping of semantic content from the World Wide Web
We describe Thresher, a system that lets non-technical users teach their browsers how to extract semantic web content from HTML documents on the World Wide Web. Users specify exam...
Andrew Hogue, David R. Karger
BIBE
2004
IEEE
156views Bioinformatics» more  BIBE 2004»
13 years 11 months ago
GeneWebEx: Gene Annotation Web Extraction, Aggregation, and Updating from Web-Based Biomolecular Databanks
Numerous genomic annotations are currently stored in different web-accessible databanks that scientists need to mine with user-defined queries and in a batch mode to orderly integ...
Marco Masseroli, Andrea Stella, Natalia Meani, Myr...
WEBDB
2009
Springer
149views Database» more  WEBDB 2009»
14 years 2 months ago
Extracting Route Directions from Web Pages
Linguists and geographers are more and more interested in route direction documents because they contain interesting motion descriptions and language patterns. A large number of s...
Xiao Zhang, Prasenjit Mitra, Sen Xu, Anuj R. Jaisw...
COMPSAC
2003
IEEE
14 years 22 days ago
A Supervised Visual Wrapper Generator for Web-Data Extraction
Extracting data from Web pages using wrappers is a fundamental problem arising in a large variety of applications of vast practical interest. In this paper, we propose a novel sch...
Xiaofeng Meng, Haiyan Wang, Dongdong Hu, Chen Li