Data Quality in Genome Databases

Genome databases store data about molecular biological entities such as genes, proteins, and diseases. Commercial organizations create and maintain such databases mainly because of their importance to drug discovery. Genome data is analyzed and interpreted to obtain so-called leads, i.e., promising structures for new drugs. Following a lead through drug development, testing, and finally several stages of clinical trials is extremely expensive, so a high-quality underlying database is of utmost importance. Because of their exploratory nature, genome databases, both commercial and public, are inaccurate, incomplete, outdated, and in an overall poor state. This paper highlights the important challenges of determining and improving data quality for databases storing molecular biological data. We examine the production process for genome data in detail and show that producing incorrect data is intrinsic to the process; at the same time, we highlight com...

Heiko Müller, Felix Naumann
