We propose an algorithm for estimating the common secondary structure, alignment, and posterior base pairing probabilities for two RNA sequences. A definition of structural alignment is presented, based on a novel concept of matched helical regions that generalizes the common secondary structure and alignment constraints used in prior work. A probabilistic framework for scoring structural alignments is developed based on a pseudo free energy model. Using this model, maximum a posteriori probability estimates of secondary structure and alignment, and a posteriori probabilities for base pairing, are computed with an efficient dynamic programming algorithm. Experimental results demonstrate that the proposed method offers significant improvements in structure and alignment prediction accuracy over single-sequence thermodynamic methods for secondary structure prediction and purely sequence-based alignment.
Arif Ozgun Harmanci, Gaurav Sharma, David H. Mathe