Representing web data in a machine-understandable format is a crucial task for the next generation of the web. Most current web pages are dynamic pages. A large percentage of...
Nowadays, many applications are interested in detecting and discovering changes on the web to help users understand page updates and, more generally, web dynamics. Web archiv...
Most web sites are composed of web pages that require varying amounts of time to render. Typically, web pages with a large amount of content (text/images/...
This work aims to provide a page segmentation algorithm that uses both visual and content information to extract the semantic structure of a web page. The visual information is u...
In this paper, we describe the semantic partitioner algorithm, which uses the structural and presentation regularities of Web pages to automatically transform them into hierarchi...