Sciweavers

» Discovering data dependencies in Web content mining
ACMICEC
2006
ACM
15 years 10 months ago
From HTML documents to web tables and rules
We present a browser-extending Semantic Web extraction system that maps HTML documents to tables and, where possible, to rules. First, the basic data extractor ViPER distills and ...
Kai Simon, Georg Lausen, Harold Boley
IDA
2011
Springer
14 years 11 months ago
A parallel, distributed algorithm for relational frequent pattern discovery from very large data sets
The amount of data produced by ubiquitous computing applications is quickly growing, due to the pervasive presence of small devices endowed with sensing, computing and communicatio...
Annalisa Appice, Michelangelo Ceci, Antonio Turi, ...
CIKM
2009
ACM
15 years 11 months ago
Vetting the links of the web
Many web links mislead human surfers and automated crawlers because they point to changed content, out-of-date information, or invalid URLs. It is a particular problem for large, ...
Na Dai, Brian D. Davison
WWW
2007
ACM
16 years 5 months ago
Towards domain-independent information extraction from web tables
Traditionally, information extraction from web tables has focused on small, more or less homogeneous corpora, often based on assumptions about the use of <table> tags. A mul...
Bernhard Krüpl, Bernhard Pollak, Marcus Herzo...
PKDD
2007
Springer
15 years 10 months ago
Site-Independent Template-Block Detection
Detection of template and noise blocks in web pages is an important step in improving the performance of information retrieval and content extraction. Of the many approaches propos...
Aleksander Kolcz, Wen-tau Yih