Sciweavers

609 search results - page 48 / 122
» Adaptive record extraction from web pages
PVLDB
2010
15 years 4 months ago
ObjectRunner: Lightweight, Targeted Extraction and Querying of Structured Web Data
We present in this paper ObjectRunner, a system for extracting, integrating and querying structured data from the Web. Our system harvests real-world items from template-based HTM...
Talel Abdessalem, Bogdan Cautis, Nora Derouiche
SIGIR
2004
ACM
15 years 11 months ago
Block-level link analysis
Link Analysis has shown great potential in improving the performance of web search. PageRank and HITS are two of the most popular algorithms. Most of the existing link analysis al...
Deng Cai, Xiaofei He, Ji-Rong Wen, Wei-Ying Ma
WWW
2006
ACM
16 years 6 months ago
GoGetIt!: a tool for generating structure-driven web crawlers
We present GoGetIt!, a tool for generating structure-driven crawlers that requires a minimum effort from the users. The tool takes as input a sample page and an entry point to a W...
Altigran Soares da Silva, Edleno Silva de Moura, J...
PKDD
2004
Springer
15 years 11 months ago
Summarization of Dynamic Content in Web Collections
This paper describes a new research proposal of multi-document summarization of dynamic content in web pages. Much information is lost in the Web due to the temporal character of w...
Adam Jatowt, Mitsuru Ishizuka
ACL
2006
15 years 7 months ago
URES: an Unsupervised Web Relation Extraction System
Most information extraction systems either use hand written extraction patterns or use a machine learning algorithm that is trained on a manually annotated corpus. Both of these a...
Binyamin Rosenfeld, Ronen Feldman