Sciweavers

391 search results - page 22 / 79
» Finding and Extracting Data Records from Web Pages
SPIRE
1999
Springer
14 years 22 days ago
Top-down Extraction of Semi-Structured Data
In this paper, we propose an innovative approach to extracting semi-structured data from Web sources. The idea is to collect a couple of example objects from the user and to use t...
Berthier A. Ribeiro-Neto, Alberto H. F. Laender, A...
AUSDM
2006
Springer
14 years 6 days ago
Tracking the Changes of Dynamic Web Pages in the Existence of URL Rewriting
Crawlers in a knowledge management system need to collect and archive documents from websites, and also track the change status of these documents. However, the existence of URL r...
Ping-Jer Yeh, Jie-Tsung Li, Shyan-Ming Yuan
WWW
2008
ACM
14 years 9 months ago
As we may perceive: finding the boundaries of compound documents on the web
This paper considers the problem of identifying on the Web compound documents (cDocs): groups of web pages that in aggregate constitute semantically coherent information entities...
Pavel Dmitriev
ITCC
2005
IEEE
14 years 2 months ago
Elimination of Redundant Information for Web Data Mining
These days, billions of Web pages are created with HTML or other markup languages. They only have a few uniform structures and contain various authoring styles compared to traditi...
Shakirah Mohd Taib, Soon-ja Yeom, Byeong Ho Kang
AI
2005
Springer
13 years 10 months ago
Integrating Web Content Clustering into Web Log Association Rule Mining
One of the effects of the general Internet growth is an immense number of user accesses to WWW resources. These accesses are recorded in the web server log files, which...
Jiayun Guo, Vlado Keselj, Qigang Gao