Sciweavers

» Data Quality in Genome Databases
PVLDB
2010
Scalable Data Exchange with Functional Dependencies
The recent literature has provided a solid theoretical foundation for the use of schema mappings in data-exchange applications. Following this formalization, new algorithms have b...
Bruno Marnette, Giansalvatore Mecca, Paolo Papotti
SIGMOD
2008
ACM
Incorporating string transformations in record matching
Today's record matching infrastructure does not allow a flexible way to account for synonyms such as "Robert" and "Bob" which refer to the same name, and ...
Arvind Arasu, Surajit Chaudhuri, Kris Ganjam, Ragh...
SIGMOD
2005
ACM
A Cost-Based Model and Effective Heuristic for Repairing Constraints by Value Modification
Data integrated from multiple sources may contain inconsistencies that violate integrity constraints. The constraint repair problem attempts to find "low cost" changes t...
Philip Bohannon, Michael Flaster, Wenfei Fan, Raje...
SISAP
2011
IEEE
Succinct nearest neighbor search
In this paper we present a novel technique for nearest neighbor searching dubbed neighborhood approximation. The central idea is to divide the database into compact regions repres...
Eric Sadit Tellez, Edgar Chávez, Gonzalo Na...
ICDE
1999
IEEE
Clustering Large Datasets in Arbitrary Metric Spaces
Clustering partitions a collection of objects into groups called clusters, such that similar objects fall into the same group. Similarity between objects is defined by a distance ...
Venkatesh Ganti, Raghu Ramakrishnan, Johannes Gehr...