Sciweavers

609 search results - page 58 / 122
» Adaptive record extraction from web pages
WWW
2004
ACM
16 years 6 months ago
Automatic web news extraction using tree edit distance
The Web poses itself as the largest data repository ever available in the history of humankind. Major efforts have been made in order to provide efficient access to relevant infor...
Davi de Castro Reis, Paulo Braz Golgher, Altigran ...
ISI
2004
Springer
15 years 11 months ago
Generating Concept Hierarchies from Text for Intelligence Analysis
It is important to automatically extract key information from sensitive text documents for intelligence analysis. Text documents are usually unstructured and information extraction...
Jenq-Haur Wang, Chien-Chung Huang, Jei-Wen Teng, L...
SIGMOD
2009
ACM
16 years 23 days ago
Robust web extraction: an approach based on a probabilistic tree-edit model
On script-generated web sites, many documents share common HTML tree structure, allowing wrappers to effectively extract information of interest. Of course, the scripts and thus ...
Nilesh N. Dalvi, Philip Bohannon, Fei Sha
ITCC
2003
IEEE
15 years 11 months ago
Analysis and regeneration of hypermedia contents through Java and XML tools
This paper presents a tool for the analysis and regeneration of web contents, implemented through XML and Java. At the moment, the web content delivery from server to clients is ...
David Mérida, Ramón Fabregat, Anna U...
IJCAI
2003
15 years 7 months ago
Visual Programming of Web Data Aggregation Applications
Most of the information needs today can be satisfied by searching and browsing the Web. However, repetitive tasks such as monitoring information on Web sites should be done autom...
Robert Baumgartner, Georg Gottlob, Marcus Herzog