Sciweavers

738 search results - page 135 / 148
» High-Performance Extensible Indexing
WWW
2005
ACM
14 years 8 months ago
A personalized search engine based on web-snippet hierarchical clustering
In this paper we propose a hierarchical clustering engine, called SnakeT, that is able to organize on-the-fly the search results drawn from 16 commodity search engines into a hier...
Paolo Ferragina, Antonio Gulli
KDD
2008
ACM
183 views, Data Mining
14 years 8 months ago
De-duping URLs via rewrite rules
A large fraction of the URLs on the web contain duplicate (or near-duplicate) content. De-duping URLs is an extremely important problem for search engines, since all the principal...
Anirban Dasgupta, Ravi Kumar, Amit Sasturkar
SIGMOD
2009
ACM
155 views, Database
14 years 7 months ago
Efficient top-k algorithms for fuzzy search in string collections
An approximate search query on a collection of strings finds those strings in the collection that are similar to a given query string, where similarity is defined using a given si...
Rares Vernica, Chen Li
SIGMOD
2008
ACM
139 views, Database
14 years 7 months ago
Paths to stardom: calibrating the potential of a peer-based data management system
As peer-to-peer (P2P) networks become more familiar to the database community, intense interest has built up in using their scalability and resilience properties to scale database...
Mihai Lupu, Beng Chin Ooi, Y. C. Tay
SIGMOD
2006
ACM
238 views, Database
14 years 7 months ago
Continuous monitoring of top-k queries over sliding windows
Given a dataset P and a preference function f, a top-k query retrieves the k tuples in P with the highest scores according to f. Even though the problem is well-studied in convent...
Kyriakos Mouratidis, Spiridon Bakiras, Dimitris Pa...