Benchmarking declarative approximate selection predicates

Declarative data quality has been an active research topic. The fundamental principle behind a declarative approach to data quality is the use of declarative statements to realize data quality primitives on top of any relational data source. A primary advantage of such an approach is its ease of use and integration with existing applications. Over the last few years, several similarity predicates have been proposed for common quality primitives (approximate selections, joins, etc.) and have been fully expressed using declarative SQL statements. In this paper we propose new similarity predicates along with their declarative realization, based on notions of probabilistic information retrieval. In particular, we show how language models and hidden Markov models can be utilized as similarity predicates for data quality and present their full declarative instantiation. We also show how other scoring methods from information retrieval can be utilized in a similar setting. We then present full...

Database | Declarative Data Quality | Keywords Declarative Data | SIGMOD 2007 | Similarity Predicates |

Added	08 Dec 2009
Updated	08 Dec 2009
Type	Conference
Year	2007
Where	SIGMOD
Authors	Amit Chandel, Oktie Hassanzadeh, Nick Koudas, Mohammad Sadoghi, Divesh Srivastava
