Sciweavers

485 search results - page 86 / 97
» Data Warehouse Clustering on the Web
WWW
2008
ACM
14 years 8 months ago
Efficient similarity joins for near duplicate detection
With the increasing amount of data and the need to integrate data from multiple data sources, a challenging issue is to find near duplicate records efficiently. In this paper, we ...
Chuan Xiao, Wei Wang 0011, Xuemin Lin, Jeffrey Xu ...
WWW
2007
ACM
14 years 8 months ago
Using d-gap patterns for index compression
Sequential patterns of d-gaps exist pervasively in inverted lists of Web document collection indices due to the cluster property. In this paper the information of d-gap sequential...
Jinlin Chen, Terry Cook
AIRWEB
2007
Springer
14 years 1 month ago
Extracting Link Spam using Biased Random Walks from Spam Seed Sets
Link spam deliberately manipulates hyperlinks between web pages in order to unduly boost the search engine ranking of one or more target pages. Link based ranking algorithms such ...
Baoning Wu, Kumar Chellapilla
SIGMOD
2004
ACM
14 years 7 months ago
The Next Database Revolution
Database system architectures are undergoing revolutionary changes. Most importantly, algorithms and data are being unified by integrating programming languages with the database ...
Jim Gray
SIGMOD
2011
ACM
12 years 10 months ago
Oracle database filesystem
Modern enterprise, web, and multimedia applications are generating unstructured content at unforeseen volumes in the form of documents, texts, and media files. Such content is gen...
Krishna Kunchithapadam, Wei Zhang, Amit Ganesh, Ni...