Sciweavers

2876 search results - page 30 / 576
WWW
2003
ACM
14 years 9 months ago
DOM-based content extraction of HTML documents
Web pages often contain clutter (such as pop-up ads, unnecessary images and extraneous links) around the body of an article that distracts a user from actual content. Extraction o...
Suhit Gupta, Gail E. Kaiser, David Neistadt, Peter...
WWW
2005
ACM
14 years 9 months ago
Using visual cues for extraction of tabular data from arbitrary HTML documents
We describe a method to extract tabular data from web pages. Rather than just analyzing the DOM tree, we also exploit visual cues in the rendered version of the document to extrac...
Bernhard Krüpl, Marcus Herzog, Wolfgang Gatte...
SIGIR
2004
ACM
14 years 2 months ago
Query-related data extraction of hidden web documents
Most of the information on the Web is stored in document databases and is not indexed by general-purpose search engines (e.g., Google and Yahoo). Such information is dyna...
Yih-Ling Hedley, Muhammad Younas, Anne E. James, M...
ICDE
2006
IEEE
14 years 2 months ago
Using Data-Extraction Ontologies to Foster Automating Semantic Annotation
Semantic annotation adds formal metadata to web pages to link web data with ontology concepts. Automated semantic annotation is a primary way of enabling the semantic web. A main ...
Yihong Ding, David W. Embley
WIDM
2003
ACM
14 years 2 months ago
Schema-guided wrapper maintenance for web-data extraction
Extracting data from Web pages using wrappers is a fundamental problem arising in a large variety of applications of vast practical interest. There are two main issues relevant t...
Xiaofeng Meng, Dongdong Hu, Chen Li