Sciweavers

» Estimating the Quality of Databases
CORR
2004
Springer
15 years 4 months ago
The Google Similarity Distance
Words and phrases acquire meaning from the way they are used in society, from their relative semantics to other words and phrases. For computers the equivalent of 'society' is...
Rudi Cilibrasi, Paul M. B. Vitányi
SIGMOD
2004
ACM
16 years 4 months ago
Rank-aware Query Optimization
Ranking is an important property that needs to be fully supported by current relational query engines. Recently, several rank-join query operators have been proposed based on rank...
Ihab F. Ilyas, Rahul Shah, Walid G. Aref, Jeffrey ...
EDBT
2004
ACM
16 years 4 months ago
DBDC: Density Based Distributed Clustering
Clustering has become an increasingly important task in modern application domains such as marketing and purchasing assistance, multimedia, molecular biology as well as m...
Eshref Januzaj, Hans-Peter Kriegel, Martin Pfeifle
PODS
2010
ACM
15 years 9 months ago
An optimal algorithm for the distinct elements problem
We give the first optimal algorithm for estimating the number of distinct elements in a data stream, closing a long line of theoretical research on this problem begun by Flajolet...
Daniel M. Kane, Jelani Nelson, David P. Woodruff
ICDE
2010
IEEE
16 years 1 month ago
Detecting Inconsistencies in Distributed Data
One of the central problems for data quality is inconsistency detection. Given a database D and a set Σ of dependencies as data quality rules, we want to identify tuples in D ...
Wenfei Fan, Floris Geerts, Shuai Ma, Heiko Mü...