In this paper, we consider the problem of extracting structured data from web pages taking into account both the content of individual attributes as well as the structure of pages...
: Contextual Advertising for Web Article Printing Shengwen Yang, Jianming Jin, Parag Joshi, Sam Liu HP Laboratories HPL-2010-79 printed ad, web printing, article extraction, conte...
In this paper, we propose an innovative approach to extracting semi-structured data from Web sources. The idea is to collect a couple of example objects from the user and to use t...
Berthier A. Ribeiro-Neto, Alberto H. F. Laender, A...
These days, billions of Web pages are created with HTML or other markup languages. They only have a few uniform structures and contain various authoring styles compared to traditi...
Just as email spam has negatively impacted the user messaging experience, the rise of Web spam is threatening to severely degrade the quality of information on the World Wide Web....