Sciweavers

2677 search results - page 51 / 536
» Extracting Structured Data from Web Pages
IPPS
2008
IEEE
14 years 3 months ago
Multi-threaded data mining of EDGAR CIKs (Central Index Keys) from ticker symbols
This paper describes how to use the Java Swing HTMLEditorKit to perform multi-threaded web data mining on the EDGAR system (Electronic Data Gathering, Analysis, and Retrieval system)...
Douglas A. Lyon
PODS
2002
ACM
14 years 8 months ago
Monadic Datalog and the Expressive Power of Languages for Web Information Extraction
Research on information extraction from Web pages (wrapping) has seen much activity in recent times (particularly systems implementations), but little work has been done on formal...
Georg Gottlob, Christoph Koch
DKE
2006
13 years 8 months ago
Sampling, information extraction and summarisation of Hidden Web databases
Hidden Web databases maintain a collection of specialised documents, which are dynamically generated in response to users' queries. The majority of these documents are genera...
Yih-Ling Hedley, Muhammad Younas, Anne E. James, M...
AAAI
2007
13 years 11 months ago
From Whence Does Your Authority Come? Utilizing Community Relevance in Ranking
A web page may be relevant to multiple topics; even when nominally on a single topic, the page may attract attention (and thus links) from multiple communities. Instead of indiscr...
Lan Nie, Brian D. Davison, Baoning Wu
DAIS
2006
13 years 10 months ago
PAGE: A Distributed Infrastructure for Fostering RDF-Based Interoperability
This paper shows how to build a scalable, robust and efficient distributed Internet-scale RDF repository, which we name PAGE (Put And Get Everywhere)...
Emanuele Della Valle, Andrea Turati, Alessandro Gh...