Sciweavers

1260 search results - page 203 / 252
» Data Quality in Genome Databases
PVLDB
2008
99 views
13 years 7 months ago
Industry-scale duplicate detection
Duplicate detection is the process of identifying multiple representations of the same real-world object in a data source. Duplicate detection is a problem of critical importance in...
Melanie Weis, Felix Naumann, Ulrich Jehle, Jens Lu...
ICDE
2011
IEEE
234 views, Database
12 years 11 months ago
How schema independent are schema free query interfaces?
Real-world databases often have extremely complex schemas. With thousands of entity types and relationships, each with a hundred or so attributes, it is extremely difficult fo...
Arash Termehchy, Marianne Winslett, Yodsawalai Cho...
SIGMOD
2006
ACM
114 views, Database
14 years 7 months ago
Ordering the attributes of query results
There has been a great deal of interest in the past few years in ranking the results of queries on structured databases, including work on probabilistic information retrieval, rank...
Gautam Das, Vagelis Hristidis, Nishant Kapoor, S. ...
JSS
2006
78 views
13 years 7 months ago
An empirical study of process-related attributes in segmented software cost-estimation relationships
Parametric software effort estimation models consisting of a single mathematical relationship suffer from poor adjustment and predictive characteristics in cases in which the hist...
Juan Jose Cuadrado-Gallego, Miguel-Ángel Si...
KDD
2012
ACM
217 views, Data Mining
11 years 10 months ago
The long and the short of it: summarising event sequences with serial episodes
An ideal outcome of pattern mining is a small set of informative patterns, containing no redundancy or noise, that identifies the key structure of the data at hand. Standard freq...
Nikolaj Tatti, Jilles Vreeken