Sciweavers

85 search results - page 6 / 17
» ECON: An Approach to Extract Content from Web News Page
WWW
2003
ACM
14 years 8 months ago
DOM-based content extraction of HTML documents
Web pages often contain clutter (such as pop-up ads, unnecessary images and extraneous links) around the body of an article that distracts a user from actual content. Extraction o...
Suhit Gupta, Gail E. Kaiser, David Neistadt, Peter...
ICDM
2002
IEEE
14 years 13 days ago
Recognition of Common Areas in a Web Page Using Visual Information: a possible application in a page classification
Extracting and processing information from web pages is an important task in many areas like constructing search engines, information retrieval, and data mining from the Web. Comm...
Milos Kovacevic, Michelangelo Diligenti, Marco Gor...
DEXAW
1999
IEEE
13 years 11 months ago
Personalizing the Web Using Site Descriptions
The information overload on the Web has created a great need for efficient filtering mechanisms. Many sites (e.g., CNN and Quicken) address this problem by allowing a user to crea...
Vinod Anupam, Yuri Breitbart, Juliana Freire, Bhar...
PRICAI
2000
Springer
13 years 11 months ago
Extracting Logical Schema from the Web
One of the main limitations when accessing the web is the lack of explicit structure, whose presence may help in understanding data semantics. Schema for web data can be constructe...
Vincenza Carchiolo, Alessandro Longheu, Michele Ma...
WWW
2010
ACM
13 years 7 months ago
Exploiting content redundancy for web information extraction
We propose a novel extraction approach that exploits content redundancy on the web to extract structured data from template-based web sites. We start by populating a seed database...
Pankaj Gulhane, Rajeev Rastogi, Srinivasan H. Seng...