Sciweavers

» Extracting Structured Data from Web Pages
WWW
2001
ACM
Effective Web data extraction with standard XML technologies
We discuss the problem of Web data extraction and describe an XML-based methodology whose goal extends far beyond simple "screen scraping." An ideal data extraction proc...
Jussi Myllymaki
CIDR
2003
Crossing the Structure Chasm
It has frequently been observed that most of the world’s data lies outside database systems. The reason is that database systems focus on structured data, leaving the unstructur...
Alon Y. Halevy, Oren Etzioni, AnHai Doan, Zachary ...
VLDB
2004
ACM
Combating Web Spam with TrustRank
Web spam pages use various techniques to achieve higher-than-deserved rankings in a search engine’s results. While human experts can identify spam, it is too expensive to manual...
Zoltán Gyöngyi, Hector Garcia-Molina, ...
COOPIS
1998
IEEE
Wrapper Generation for Web Accessible Data Sources
There is an increase in the number of data sources that can be queried across the WWW. Such sources typically support HTML forms-based interfaces and search engines query collecti...
Jean-Robert Gruser, Louiqa Raschid, Maria-Esther V...
IUI
2003
ACM
Dynamic web page authoring by example using ontology-based domain knowledge
Authoring dynamic web pages is an inherently difficult task. We present DESK, an interactive authoring tool that allows the customization of dynamic page generation procedures wit...
José Antonio Macías Iglesias, Pablo ...