Sciweavers

» Extracting Structured Data from Web Pages
INLG
2010
Springer
15 years 12 days ago
Extracting Parallel Fragments from Comparable Corpora for Data-to-text Generation
Building NLG systems, in particular statistical ones, requires parallel data (paired inputs and outputs) which do not generally occur naturally. In this paper, we investigate the ...
Anja Belz, Eric Kow
JOT
2008
15 years 2 months ago
Mining Edgar Tender Offers
This paper describes how to use the HTMLEditorKit to perform web data mining on EDGAR (Electronic Data-Gathering, Analysis, and Retrieval system). EDGAR is the SEC's (U.S. Secur...
Douglas Lyon
CIKM
2008
Springer
15 years 4 months ago
Intra-document structural frequency features for semi-supervised domain adaptation
In this work we try to bridge the gap often encountered by researchers who find themselves with few or no labeled examples from their desired target domain, yet still have access ...
Andrew Arnold, William W. Cohen
DEXAW
2002
IEEE
15 years 7 months ago
Semi-Automated Extraction of Ontological Knowledge from XML Datasources
In the paper we present a methodology for the semiautomated extraction of ontological knowledge from XML data sources in a given domain. We consider an interconnection scenario ov...
Silvana Castano, Valeria De Antonellis, Sabrina De...
ADMA
2008
Springer
15 years 8 months ago
Link-Contexts for Ranking
Anchor text has been shown to be effective in ranking [6] and a variety of information retrieval tasks on web pages. Some authors have expanded on anchor text by using the words ar...
Jessica Gronski