
BIOCOMP
2010

Using the Genetic Code Wisdom for Recognizing Protein Coding Sequences

We have developed a new method for recognizing protein coding sequences in genomic sequences. The method exploits the specific pattern of degeneracy in the genetic code and the relations between mutational pressure and selection pressure that shape amino acid usage in proteomes. It is based on analyses of correlations in nucleotide occurrence, computed separately for the first, second and third putative codon positions, using only six 4x4 matrices. Because the matrices are small, only a few coding sequences are needed to train the algorithm. The results of the new method were compared with those of the Markov chain methods used in GeneMark for different genomes, including discrimination between the leading and lagging DNA strands. The method uses no arbitrary cutoff to discriminate between coding and noncoding sequences. Instead, putative coding sequences can be ranked according to their coding probability, which is especially important when looking for small coding ORFs.
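The general idea of position-specific 4x4 matrices can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not say how the six matrices are defined, so the sketch assumes, purely for illustration, three transition matrices (codon position 1 to 2, 2 to 3, and 3 to position 1 of the next codon). The function names `train_matrices`, `score` and `rank_orfs` and the `pseudocount` parameter are hypothetical.

```python
import math

NUCS = "ACGT"
IDX = {n: i for i, n in enumerate(NUCS)}


def train_matrices(coding_seqs, pseudocount=1.0):
    """Build three row-normalised 4x4 matrices from in-frame coding sequences.

    Matrix k holds P(next nucleotide | current nucleotide) for a current
    nucleotide at codon position k (0-based). This layout is an assumption
    made for illustration, not the definition used in the paper.
    """
    counts = [[[pseudocount] * 4 for _ in range(4)] for _ in range(3)]
    for seq in coding_seqs:
        seq = seq.upper()
        for i in range(len(seq) - 1):
            a, b = seq[i], seq[i + 1]
            if a in IDX and b in IDX:  # skip ambiguous symbols such as N
                counts[i % 3][IDX[a]][IDX[b]] += 1
    return [[[c / sum(row) for c in row] for row in m] for m in counts]


def score(seq, matrices, frame=0):
    """Mean log-probability per transition of seq read in the given frame."""
    seq = seq.upper()
    total, n = 0.0, 0
    for i in range(len(seq) - 1):
        a, b = seq[i], seq[i + 1]
        if a in IDX and b in IDX:
            total += math.log(matrices[(i - frame) % 3][IDX[a]][IDX[b]])
            n += 1
    return total / n if n else float("-inf")


def rank_orfs(orfs, matrices):
    """Order candidate ORFs from highest to lowest score, with no cutoff."""
    return sorted(orfs, key=lambda s: score(s, matrices), reverse=True)
```

Each matrix has only 16 entries, so a handful of training sequences already populates every cell, which is consistent with the small-training-set argument above. Sorting by score instead of thresholding mirrors the ranking of putative coding sequences described in the abstract.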
Pawel Blazej, Pawel Mackiewicz, Stanislaw Cebrat
Added 21 Mar 2011
Updated 21 Mar 2011
Type Journal
Year 2010
Where BIOCOMP