Explore or Exploit? Effective Strategies for Disambiguating Large Databases

Data ambiguity is inherent in applications such as data integration, location-based services, and sensor monitoring. In many situations, it is possible to “clean”, or remove, ambiguities from these databases. For example, the GPS location of a user is inexact due to measurement errors, but context information (e.g., what the user is doing) can be used to reduce the imprecision of the location value. To obtain a higher-quality database, we study how to disambiguate a database by appropriately selecting candidates to clean. This problem is challenging because cleaning involves a cost, is limited by a budget, may fail, and may not remove all ambiguities. Moreover, the statistical information about how likely database objects are to be cleaned successfully may not be precisely known. We tackle these challenges by proposing two types of algorithms. The first type makes use of greedy heuristics to make sensible decisions; however, these algorithms do not make use of cleaning information ...
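
The selection problem in the abstract (choose which objects to clean when each attempt has a cost, the total is capped by a budget, and an attempt may fail) can be illustrated with a small greedy heuristic. This is only a sketch under stated assumptions, not the algorithm from the paper: the `Candidate` fields, the `greedy_select` function, and the ranking rule (expected gain per unit cost) are all hypothetical, and the success probabilities are assumed to be known, which the abstract notes is often not the case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A database object that could be cleaned (all fields are illustrative)."""
    name: str
    cost: int            # cost of one cleaning attempt, must be positive
    success_prob: float  # assumed-known probability that cleaning succeeds
    gain: float          # assumed quality improvement if cleaning succeeds


def greedy_select(candidates, budget):
    """Pick candidates by expected gain per unit cost until the budget runs out."""
    ranked = sorted(
        candidates,
        key=lambda c: c.success_prob * c.gain / c.cost,
        reverse=True,
    )
    chosen, remaining = [], budget
    for c in ranked:
        # Skip anything that no longer fits; cheaper candidates may still fit.
        if c.cost <= remaining:
            chosen.append(c)
            remaining -= c.cost
    return chosen, remaining


candidates = [
    Candidate("gps_reading_1", cost=2, success_prob=0.9, gain=1.0),
    Candidate("gps_reading_2", cost=5, success_prob=0.5, gain=4.0),
    Candidate("sensor_7", cost=1, success_prob=0.2, gain=1.0),
]
chosen, remaining = greedy_select(candidates, budget=6)
print([c.name for c in chosen], remaining)
```

A ranking of this kind only exploits the current estimates. When the success probabilities are uncertain, a strategy would also need to spend part of the budget learning them, which is the trade-off the title refers to.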

Reynold Cheng, Eric Lo, Xuan Yang, Ming-Hay Luk, Xiang Li, Xike Xie

Cleaning | Cleaning Effectiveness | Database | PVLDB 2010 |

» Optimization on active learning strategy for object category retrieval

» Genetic fuzzy systems to evolve coordination strategies in competitive distributed systems

» Speech Emotion Analysis Exploring the Role of Context

» Mining Classification Rules from Datasets with Large Number of Many-Valued Attributes

» Discriminative Frequent Pattern Analysis for Effective Classification

» Applying Decay Strategies to Branch Predictors for Leakage Energy Savings

» UNIBASENSE CLEF 2009 Robust WSD Task

» An Assessment of a Metric Space Database Index to Support Sequence Homology

Post Info

Added	30 Jan 2011
Updated	30 Jan 2011
Type	Journal
Year	2010
Where	PVLDB
Authors	Reynold Cheng, Eric Lo, Xuan Yang, Ming-Hay Luk, Xiang Li, Xike Xie
