Sciweavers

609 search results - page 40 / 122
» Adaptive record extraction from web pages
WWW
2005
ACM
16 years 6 months ago
Extracting context to improve accuracy for HTML content extraction
Web pages contain clutter (such as ads, unnecessary images and extraneous links) around the body of an article, which distracts a user from actual content. Extraction of "use...
Suhit Gupta, Gail E. Kaiser, Salvatore J. Stolfo
ACMICEC
2006
ACM
16 years 21 hours ago
From HTML documents to web tables and rules
We present a browser-extending Semantic Web extraction system that maps HTML documents to tables and, where possible, to rules. First, the basic data extractor ViPER distills and ...
Kai Simon, Georg Lausen, Harold Boley
AINA
2009
IEEE
16 years 26 days ago
CUTER: An Efficient Useful Text Extraction Mechanism
In this paper we present CUTER, a system that processes HTML pages in order to extract the useful text from them. The mechanism is focalized on HTML pages that include news articl...
George Adam, Christos Bouras, Vassilis Poulopoulos
APWEB
2006
Springer
15 years 9 months ago
Image Description Mining and Hierarchical Clustering on Data Records Using HR-Tree
Since we can hardly get semantics from the low-level features of the image, it is much more difficult to analyze the image than textual information on the Web. Traditionally, textu...
Congle Zhang, Sheng Huang, Gui-Rong Xue, Yong Yu
IICAI
2003
15 years 7 months ago
Web Usage Mining: Extraction, Maintenance and Behaviour Trends
With the growing popularity of the web, large volumes of data are gathered automatically by Web Servers and collected into access log files. Analysis of such files is generally cal...
Pierre-Alain Laur, Maguelonne Teisseire, Pascal Po...