Sciweavers

468 search results - page 30 / 94
» Automatic Data Extraction from Data-Rich Web Pages
IJCAI
2003
13 years 9 months ago
Information Extraction from Tree Documents by Learning Subtree Delimiters
Information extraction has conventionally treated HTML pages as plain text documents extended with HTML tags. However, the growing maturity and correct usage of HTML/XHT...
Boris Chidlovskii
CIKM
2009
Springer
13 years 8 months ago
Improving search engines using human computation games
Work on evaluating and improving the relevance of web search engines typically uses human relevance judgments or clickthrough data. Both these methods look at the problem of learni...
Hao Ma, Raman Chandrasekar, Chris Quirk, Abhishek ...
SPIRE
1999
Springer
13 years 12 months ago
Top-down Extraction of Semi-Structured Data
In this paper, we propose an innovative approach to extracting semi-structured data from Web sources. The idea is to collect a couple of example objects from the user and to use t...
Berthier A. Ribeiro-Neto, Alberto H. F. Laender, A...
WWW
2004
ACM
14 years 8 months ago
Time-based contextualized-news browser (t-cnb)
We propose a new way of browsing contextualized-news articles. Our prototype browser system is called a Time-based Contextualized-News Browser (T-CNB). The T-CNB concurrently and a...
Akiyo Nadamoto, Katsumi Tanaka
VLDB
2011
ACM
13 years 2 months ago
Harvesting relational tables from lists on the web
A large number of web pages contain data structured in the form of “lists”. Many such lists can be further split into multi-column tables, which can then be used in more seman...
Hazem Elmeleegy, Jayant Madhavan, Alon Y. Halevy