Sciweavers

VLDB
2002
ACM
Approximate Frequency Counts over Data Streams
We present algorithms for computing frequency counts exceeding a user-specified threshold over data streams. Our algorithms are simple and have provably small memory footprints. A...
Gurmeet Singh Manku, Rajeev Motwani
VLDB
2002
ACM
ALIAS: An Active Learning led Interactive Deduplication System
Deduplication, a key operation in integrating data from multiple sources, is a time-consuming, labor-intensive and domain-specific operation. We present our design of ALIAS that us...
Sunita Sarawagi, Anuradha Bhamidipaty, Alok Kirpal...
VLDB
2002
ACM
Eliminating Fuzzy Duplicates in Data Warehouses
The duplicate elimination problem of detecting multiple tuples that describe the same real-world entity is an important data cleaning problem. Previous domain-independent solut...
Rohit Ananthakrishna, Surajit Chaudhuri, Venkatesh...
VLDB
2002
ACM
Efficient Retrieval of Similar Shapes
We propose an indexing technique for the fast retrieval of objects in 2D images based on similarity between their boundary shapes. Our technique is robust in the presence...
Davood Rafiei, Alberto O. Mendelzon
VLDB
2002
ACM
Plan Selection Based on Query Clustering
Query optimization is a computationally intensive process, especially for complex queries. We present here a tool, called PLASTIC, that can be used by query optimizers to amortize...
Antara Ghosh, Jignashu Parikh, Vibhuti S. Sengar, ...