Background: Codon substitution probabilities are used in many types of molecular evolution studies such as determining Ka/Ks ratios, creating ancestral DNA sequences or aligning coding DNA. Until the recent dramatic increase in genomic data enabled construction of empirical matrices, researchers relied on parameterized models of codon evolution. Here we present the first empirical codon substitution matrix entirely built from alignments of coding sequences from vertebrate DNA and thus provide an alternative to parameterized models of codon evolution. Results: A set of 17,502 alignments of orthologous sequences from five vertebrate genomes yielded 8.3 million aligned codons from which the number of substitutions between codons were counted. From this data, both a probability matrix and a matrix of similarity scores were computed. They are 64
Adrian Schneider, Gina Cannarozzi, Gaston H. Gonne