Sciweavers

» Extracting Structured Data from Web Pages
AAAI
2000
13 years 10 months ago
Learning the Common Structure of Data
The proliferation of online information sources has accentuated the need for tools that automatically validate and recognize data. We present an efficient algorithm that learns st...
Kristina Lerman, Steven Minton
CN
2007
13 years 8 months ago
On the peninsula phenomenon in web graph and its implications on web search
Web masters usually place certain web pages such as home pages and index pages in front of others. Under such a design, it is necessary to go through some pages to reach the desti...
Tao Meng, Hong-Fei Yan
KES
2006
Springer
13 years 8 months ago
Web Site Off-Line Structure Reconfiguration: A Web User Browsing Analysis
The correct web site text content must help visitors find what they are looking for. However, the reality is quite different: many times the web page text content is a...
Sebastián A. Ríos, Juan D. Velá...
VLDB
2004
ACM
14 years 1 months ago
An Automatic Data Grabber for Large Web Sites
We demonstrate a system to automatically grab data from data intensive web sites. The system first infers a model that describes at the intensional level the web site as a collec...
Valter Crescenzi, Giansalvatore Mecca, Paolo Meria...
ICDE
2004
IEEE
14 years 10 months ago
Probe, Cluster, and Discover: Focused Extraction of QA-Pagelets from the Deep Web
In this paper, we introduce the concept of a QA-Pagelet to refer to the content region in a dynamic page that contains query matches. We present THOR, a scalable and efficient min...
James Caverlee, Ling Liu, David Buttler