This paper describes the development of algorithms for extracting the title and the names of the authors from documents available on the World Wide Web. In this paper we describe ...
Eric G. Berkowitz, Mohamed Reda Elkhadiri, Tim Sah...
We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
Despite recent advances in wireless and portable hardware technologies, mobile access to the Web is often laborious. For this reason, several solutions have been proposed to custom...
Leonardo Teixeira Passos, Marco Tulio de Oliveira ...
We developed and tested a heuristic technique for extracting the main article from news site Web pages. We construct the DOM tree of the page and score every node based on the amo...
This paper addresses the problem of content-based synchronization for robust watermarking. Synchronization is a process of extracting the location to embed and detect the signature...
Hae-Yeoun Lee, Jong-Tae Kim, Heung-Kyu Lee, Young-...