Sciweavers

» Elimination of Redundant Information for Web Data Mining
ACMICEC
2006
ACM
14 years 1 month ago
From HTML documents to web tables and rules
We present a browser-extending Semantic Web extraction system that maps HTML documents to tables and, where possible, to rules. First, the basic data extractor ViPER distills and ...
Kai Simon, Georg Lausen, Harold Boley
PKDD
2004
Springer
14 years 1 month ago
Summarization of Dynamic Content in Web Collections
This paper describes a new research proposal of multi-document summarization of dynamic content in web pages. Much information is lost in the Web due to the temporal character of w...
Adam Jatowt, Mitsuru Ishizuka
ICDE
2009
IEEE
15 years 7 months ago
FF-Anonymity: When Quasi-Identifiers Are Missing
Existing approaches on privacy-preserving data publishing rely on the assumption that data can be divided into quasi-identifier attributes (QI) and sensitive attribute (SA). This ...
Ada Wai-Chee Fu, Ke Wang, Raymond Chi-Wing Wong, Y...
WWW
2007
ACM
14 years 8 months ago
Classifying web sites
In this paper, we present a novel method for the classification of Web sites. This method exploits both structure and content of Web sites in order to discern their functionality....
Christoph Lindemann, Lars Littig
WWW
2005
ACM
14 years 8 months ago
The volume and evolution of web page templates
Web pages contain a combination of unique content and template material, which is present across multiple pages and used primarily for formatting, navigation, and branding. We stu...
David Gibson, Kunal Punera, Andrew Tomkins