Sciweavers

» Extracting Patterns and Relations from the World Wide Web
WEBDB
1999
Springer
196 views, Database
14 years 2 months ago
Web Ecology: Recycling HTML Pages as XML Documents Using W4F
In this paper we present the World-Wide Web Wrapper Factory (W4F), a Java toolkit to generate wrappers for Web data sources. Some key features of W4F are an expressive language to...
Arnaud Sahuguet, Fabien Azavant
PVLDB
2010
114 views
13 years 8 months ago
ObjectRunner: Lightweight, Targeted Extraction and Querying of Structured Web Data
We present in this paper ObjectRunner, a system for extracting, integrating and querying structured data from the Web. Our system harvests real-world items from template-based HTM...
Talel Abdessalem, Bogdan Cautis, Nora Derouiche
HUMAN
2005
Springer
14 years 3 months ago
How to Evaluate the Effectiveness of URL Normalizations
Syntactically different URLs could represent the same web page on the World Wide Web, and duplicate representation for web pages causes web applications to handle a large amount of...
Sang Ho Lee, Sung Jin Kim, Hyo Sook Jeong
VLDB
1997
ACM
94 views, Database
14 years 2 months ago
To Weave the Web
The paper discusses the issue of views in the Web context. We introduce a set of languages for managing and restructuring data coming from the World Wide Web. We present a specifi...
Paolo Atzeni, Giansalvatore Mecca, Paolo Merialdo
ECAI
2008
Springer
14 years 4 days ago
WWW sits the SAT: Measuring Relational Similarity on the Web
Abstract. Measuring relational similarity between words is important in numerous natural language processing tasks such as solving analogy questions and classifying noun-modifier r...
Danushka Bollegala, Yutaka Matsuo, Mitsuru Ishizuk...