Wrappers are a traditional means of extracting useful information from Web pages. Most previous works rely on the similarity between HTML tag trees and induced template-dependent wrap...
While the Internet has enabled us to access a vast amount of online news articles originating from thousands of different sources, the human capability to read all these articles h...
Milos Krstajic, Florian Mansmann, Andreas Stoffel,...
An appreciation of the roles of genre and task is important in understanding how people browse the Web. Genre is characterized by content and form and is intimately linked to the ...
Carolyn R. Watters, Michael A. Shepherd, Forbes J....
The emergence of the Web has made more and more news items available; however, only a small subset of these news items is relevant in a decision-making process. Therefore decision...
Jethro Borsje, Leonard Levering, Flavius Frasincar
This paper provides a simple but effective approach, named ECON, to fully automatically extract content from Web news pages. ECON uses a DOM tree to represent the Web news...
Yan Guo, Huifeng Tang, Linhai Song, Yu Wang 0009, ...