Intelligent information processing systems, such as digital libraries or search engines index web-pages according to their informative content. However, web-pages contain several n...
Web pages are usually highly structured documents. In some documents, content with different functionality is laid out in blocks, some merely supporting the main discourse. In ot...
In this paper, we propose a new approach to discover informative contents from a set of tabular documents (or Web pages) of a Web site. Our system, InfoDiscoverer, first partition...
In this paper we discuss the possible application of new concepts in web content extraction: utility assessment, utility annealing, and dynamic aggregated document generation. Aft...
In recent years, many algorithms for the Web have been developed that work with information units distinct from individual web pages. These include segments of web pages or aggreg...