Sciweavers

» Extracting semantic structure of web documents using content...
SIGMOD
2009
ACM
Robust web extraction: an approach based on a probabilistic tree-edit model
On script-generated web sites, many documents share common HTML tree structure, allowing wrappers to effectively extract information of interest. Of course, the scripts and thus ...
Nilesh N. Dalvi, Philip Bohannon, Fei Sha
SEMWEB
2004
Springer
An Initial Investigation into Querying an Untrustworthy and Inconsistent Web
The Semantic Web is bound to be untrustworthy and inconsistent. In this paper, we present an initial approach for obtaining useful information in such an environment. In particular...
Yuanbo Guo, Jeff Heflin
KDD
2010
ACM
Growing a tree in the forest: constructing folksonomies by integrating structured metadata
Many social Web sites allow users to annotate the content with descriptive metadata, such as tags, and more recently to organize content hierarchically. These types of structured ...
Anon Plangprasopchok, Kristina Lerman, Lise Getoor
SIGMOD
2010
ACM
Expressive and flexible access to web-extracted data: a keyword-based structured query language
Automated extraction of structured data from Web sources often leads to large heterogeneous knowledge bases (KB), with data and schema items numbering in the hundreds of thousands...
Jeffrey Pound, Ihab F. Ilyas, Grant E. Weddell
ICDE
2000
IEEE
XWRAP: An XML-Enabled Wrapper Construction System for Web Information Sources
This paper describes the methodology and the software development of XWRAP, an XML-enabled wrapper construction system for semi-automatic generation of wrapper programs. By XML-ena...
Ling Liu, Calton Pu, Wei Han