Using a flexible representation of biological sequences, we have performed a comparative analysis of 1208 known tRNA sequences. We believe our technique is a more sensitive method for detecting structural and functional relationships in sets of aligned sequences for two reasons: it uses a flexible sequence representation, and it applies a general statistical method that can detect a wide range of relationships between positions in a sequence. Our method uses functional classifications of the sequence building blocks (nucleotide bases and amino acids) based on physical or chemical properties. This flexibility in representation increases the statistical significance of sequence relationships mediated by the defining property. For example, using a purine/pyrimidine classification, we can detect base-stacking interactions in sets of nucleotide sequences that form base-paired helices. We use several statistical measures, including the χ²-test, Monte Carlo simulation and an information...
Tod M. Klingler, Douglas L. Brutlag
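
The approach described in the abstract (recode each aligned position with a property-based classification, then test pairs of positions for association) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `recode`, `chi_square` and `monte_carlo_p`, the toy columns, and the use of a permutation test as the Monte Carlo step are all assumptions made for illustration.

```python
import random
from collections import Counter

# Purine/pyrimidine classification of bases (R = purine, Y = pyrimidine).
# Hypothetical mapping for illustration; other property-based classes work the same way.
PURINE_PYRIMIDINE = {"A": "R", "G": "R", "C": "Y", "U": "Y", "T": "Y"}


def recode(column, classes=PURINE_PYRIMIDINE):
    """Map each residue in one alignment column to its class label."""
    return [classes[base] for base in column]


def chi_square(xs, ys):
    """Pearson chi-square statistic for the contingency table of two columns."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    stat = 0.0
    for a in row_totals:
        for b in col_totals:
            expected = row_totals[a] * col_totals[b] / n
            observed = joint.get((a, b), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def monte_carlo_p(xs, ys, trials=1000, seed=0):
    """Estimate a p-value by shuffling one column (assumed permutation scheme)."""
    rng = random.Random(seed)
    observed = chi_square(xs, ys)
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        # Small tolerance guards against floating-point summation order.
        if chi_square(xs, shuffled) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (trials + 1)


# Toy data: two alignment columns read down a set of eight sequences.
column_i = recode("AGAGCUCU")
column_j = recode("UCUCGAGA")
statistic = chi_square(column_i, column_j)
p_value = monte_carlo_p(column_i, column_j)
```

Recoding collapses the four-letter alphabet into two classes, so the contingency table has fewer cells and each cell holds more counts. That is one way a property-based representation can make an association easier to detect in a small alignment.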