CONTRAlign: Discriminative Training for Protein Sequence Alignment

In this paper, we present CONTRAlign, an extensible and fully automatic framework for parameter learning and protein pairwise sequence alignment using pair conditional random fields. When learning a substitution matrix and gap penalties from as few as 20 example alignments, CONTRAlign achieves alignment accuracies competitive with available modern tools. As confirmed by rigorous cross-validated testing, CONTRAlign effectively leverages weak biological signals in sequence alignment: using CONTRAlign, we find that hydropathy-based features result in improvements of 5-6% in aligner accuracy for sequences with less than 20% identity, a signal that state-of-the-art hand-tuned aligners have been unable to exploit effectively. Furthermore, when known secondary structure or solvent accessibility is available, such external information is naturally incorporated as additional features within the CONTRAlign framework, yielding additional improvements of up to 15-16% in alignment accuracy for low-...
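
As a rough illustration of the idea in the abstract (an alignment scored as a weighted sum of features, so that a signal such as hydropathy enters as one more weight), here is a minimal Python sketch. It is not the CONTRAlign implementation: it uses a single linear gap weight, does decoding only with no parameter learning, and the feature names, the HYDROPHOBIC set, and every weight value are hypothetical choices made for this example.

```python
from typing import Dict, Tuple

# Hypothetical hydrophobic residue set, chosen only for illustration.
HYDROPHOBIC = set("AVLIMFWC")


def pair_features(a: str, b: str) -> Dict[str, float]:
    """Features that fire when residue a is aligned to residue b."""
    feats = {"sub:" + "".join(sorted(a + b)): 1.0}  # symmetric substitution feature
    if (a in HYDROPHOBIC) == (b in HYDROPHOBIC):
        feats["same_hydropathy"] = 1.0  # extra signal added as one more feature
    return feats


def score(feats: Dict[str, float], weights: Dict[str, float]) -> float:
    """Log-linear score: dot product of feature values and weights."""
    return sum(weights.get(name, 0.0) * value for name, value in feats.items())


def viterbi_align(x: str, y: str, weights: Dict[str, float]) -> Tuple[float, str, str]:
    """Highest-scoring global alignment under the weighted-feature score."""
    gap = weights.get("gap", 0.0)
    n, m = len(x), len(y)
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    back = [[""] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        best[i][0], back[i][0] = i * gap, "up"
    for j in range(1, m + 1):
        best[0][j], back[0][j] = j * gap, "left"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            candidates = [
                (best[i - 1][j - 1] + score(pair_features(x[i - 1], y[j - 1]), weights), "diag"),
                (best[i - 1][j] + gap, "up"),
                (best[i][j - 1] + gap, "left"),
            ]
            best[i][j], back[i][j] = max(candidates, key=lambda c: c[0])
    # Trace back from the bottom-right cell to recover the aligned strings.
    ax, ay = [], []
    i, j = n, m
    while i > 0 or j > 0:
        move = back[i][j]
        if move == "diag":
            ax.append(x[i - 1]); ay.append(y[j - 1]); i -= 1; j -= 1
        elif move == "up":
            ax.append(x[i - 1]); ay.append("-"); i -= 1
        else:
            ax.append("-"); ay.append(y[j - 1]); j -= 1
    return best[n][m], "".join(reversed(ax)), "".join(reversed(ay))


# Made-up weights; in a trained model these would be learned from example alignments.
WEIGHTS = {"sub:AA": 2.0, "sub:LL": 2.0, "gap": -1.0, "same_hydropathy": 0.5}
```

Calling viterbi_align("AL", "L", WEIGHTS) returns the best score together with the two gapped strings. Under this framing, adding another information source (for example a secondary-structure match) would amount to adding one more entry in pair_features and one more weight.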

Chuong B. Do, Samuel S. Gross, Serafim Batzoglou

Computational Biology | RECOMB 2006 | Rigorous Cross-validated Testing | State-of-the-art Hand-tuned Aligners | Weak Biological Signals |

Post Info

Added	03 Dec 2009
Updated	03 Dec 2009
Type	Conference
Year	2006
Where	RECOMB
Authors	Chuong B. Do, Samuel S. Gross, Serafim Batzoglou
