Sciweavers

684 search results - page 61 / 137
» Elimination of Redundant Information for Web Data Mining
CINQ
2004
Springer
189 views | Database
14 years 1 month ago
Employing Inductive Databases in Concrete Applications
In this paper we present the application of the inductive database approach to two practical analytical case studies: Web usage mining in Web logs and financial data. As far as co...
Rosa Meo, Pier Luca Lanzi, Maristella Matera, Dani...
INCDM
2010
Springer
125 views | Data Mining
13 years 9 months ago
Web-Site Boundary Detection
Defining the boundaries of a web-site, for (say) archiving or information retrieval purposes, is an important but complicated task. In this paper a web-page clustering approach to...
Ayesh Alshukri, Frans Coenen, Michele Zito
WSDM
2009
ACM
176 views | Data Mining
14 years 2 months ago
The web changes everything: understanding the dynamics of web content
The Web is a dynamic, ever changing collection of information. This paper explores changes in Web content by analyzing a crawl of 55,000 Web pages, selected to represent different...
Eytan Adar, Jaime Teevan, Susan T. Dumais, Jonatha...
FAST
2008
13 years 10 months ago
Avoiding the Disk Bottleneck in the Data Domain Deduplication File System
Disk-based deduplication storage has emerged as the new-generation storage system for enterprise data protection to replace tape libraries. Deduplication removes redundant data se...
Benjamin Zhu, Kai Li, R. Hugo Patterson
WSDM
2009
ACM
188 views | Data Mining
14 years 2 months ago
Is Wikipedia link structure different?
In this paper, we investigate the difference between Wikipedia and Web link structure with respect to their value as indicators of the relevance of a page for a given topic of re...
Jaap Kamps, Marijn Koolen