
GCB
1998
Springer

Combining diverse evidence for gene recognition in completely sequenced bacterial genomes

Analysis of a newly sequenced bacterial genome starts with the identification of protein-coding genes. Functional assignment of proteins requires exact knowledge of protein N-termini. We present a new program, ORPHEUS, that identifies candidate genes and accurately predicts gene starts. The analysis starts with a database similarity search and the identification of reliable gene fragments. These fragments are used to derive statistical characteristics of protein-coding regions and ribosome-binding sites and to predict the complete set of genes in the analyzed genome. In a test on the Bacillus subtilis and Escherichia coli genomes, the program correctly identified 93.3% and 96.3%, respectively, of the experimentally annotated genes longer than 100 codons described in the PIR-International database, and for these genes 96.3% (B. subtilis) and 83.9% (E. coli) of starts were predicted exactly. Furthermore, 98.9% (99.1%) of genes longer than 100 codons annotated in GenBank were found, and 92.9% (75.7%) of predicted starts coincided with the fea...
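The idea of deriving coding-region statistics from reliable gene fragments can be illustrated with a minimal sketch. This is not the ORPHEUS implementation: the abstract does not specify its statistics, so the sketch assumes a simple codon-usage log-odds score against a uniform background. The function names (`codons`, `train_codon_log_odds`, `coding_score`), the pseudocount, and the toy seed sequences are all hypothetical, and input is assumed to be uppercase DNA already in the correct reading frame.

```python
from collections import Counter
from itertools import product
from math import log

# All 64 DNA codons over the alphabet A, C, G, T.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]


def codons(seq):
    """Split a sequence into in-frame codons, dropping a trailing partial codon."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def train_codon_log_odds(seed_fragments, pseudocount=1.0):
    """Estimate codon log-odds from seed fragments (assumed reliable, in frame).

    The background is uniform (1/64 per codon); this is an illustrative
    assumption, not the statistic used by the program in the abstract.
    """
    counts = Counter(c for frag in seed_fragments for c in codons(frag))
    total = sum(counts[c] for c in CODONS) + pseudocount * len(CODONS)
    return {
        c: log(((counts[c] + pseudocount) / total) / (1.0 / len(CODONS)))
        for c in CODONS
    }


def coding_score(orf, log_odds):
    """Mean per-codon log-odds of a candidate ORF; codons with non-ACGT bases are skipped."""
    scored = [log_odds[c] for c in codons(orf) if c in log_odds]
    return sum(scored) / len(scored) if scored else 0.0


if __name__ == "__main__":
    # Toy seed fragments standing in for similarity-supported gene fragments.
    seeds = ["ATGGCTGCTGCTAAA", "ATGGCTGCTAAAGCT"]
    model = train_codon_log_odds(seeds)
    for candidate in ("GCTGCTGCT", "CCCCCCCCC"):
        print(candidate, round(coding_score(candidate, model), 3))
```

A candidate whose codon usage resembles the seed fragments scores above zero, while one built from codons never seen in the seeds scores below zero. A real gene finder would also need a model of ribosome-binding sites to choose among alternative start codons, which this sketch omits.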
Added 05 Aug 2010
Updated 05 Aug 2010
Type Conference
Year 1998
Where GCB
Authors Dmitrij Frishman, Andrey A. Mironov, Hans-Werner Mewes, Mikhail S. Gelfand