Sciweavers

» Adaptive record extraction from web pages
WWW
2011
ACM
15 years 15 days ago
Web information extraction using Markov logic networks
In this paper, we consider the problem of extracting structured data from web pages taking into account both the content of individual attributes as well as the structure of pages...
Sandeepkumar Satpal, Sahely Bhadra, Sundararajan S...
DOCENG
2010
ACM
15 years 7 months ago
Contextual advertising for web article printing
Contextual Advertising for Web Article Printing. Shengwen Yang, Jianming Jin, Parag Joshi, Sam Liu. HP Laboratories, HPL-2010-79. Keywords: printed ad, web printing, article extraction, conte...
Shengwen Yang, Jianming Jin, Parag Joshi, Sam Liu
SPIRE
1999
Springer
15 years 10 months ago
Top-down Extraction of Semi-Structured Data
In this paper, we propose an innovative approach to extracting semi-structured data from Web sources. The idea is to collect a couple of example objects from the user and to use t...
Berthier A. Ribeiro-Neto, Alberto H. F. Laender, A...
ITCC
2005
IEEE
15 years 11 months ago
Elimination of Redundant Information for Web Data Mining
These days, billions of Web pages are created with HTML or other markup languages. They only have a few uniform structures and contain various authoring styles compared to traditi...
Shakirah Mohd Taib, Soon-ja Yeom, Byeong Ho Kang
CEAS
2006
Springer
15 years 9 months ago
Introducing the Webb Spam Corpus: Using Email Spam to Identify Web Spam Automatically
Just as email spam has negatively impacted the user messaging experience, the rise of Web spam is threatening to severely degrade the quality of information on the World Wide Web....
Steve Webb, James Caverlee, Calton Pu