Protein secondary structure prediction is an important step towards understanding the relation between protein sequence and structure. However, most current prediction methods use...
Yan Liu, Jaime G. Carbonell, Judith Klein-Seethara...
The number of protein sequences stored in central protein databases by labs all over the world is constantly increasing. Of these proteins, only a fraction has b...
There are today several systems for predicting transmembrane domains in membrane protein sequences. As they are based on different classifiers as well as different pre- and post-p...
Part of the challenge of modeling protein sequences is their discrete nature. Many of the most powerful statistical and learning techniques are applicable to points in a Euclid...
We consider the problem of aligning multiple protein sequences with the goal of maximizing the SP (Sum-of-Pairs) score, when the number of sequences is large. The QOMA (...
The rapid burgeoning of available protein data makes the use of clustering within families of proteins increasingly important; the challenge is to identify subfamilies of evolut...
Abdellali Kelil, Shengrui Wang, Ryszard Brzezinski
Many basic tasks in computational biology involve operations on individual DNA and protein sequences. These sequences, even when anonymized, are vulnerable to re-identification a...
The commonly used gap penalty strategies, constant penalty and affine gap penalty, have been adopted in the traditional three-sequence alignment algorithm, which considers the inserti...
Signal finding (pattern discovery) in biological sequences is a fundamental problem in both computer science and molecular biology. Many approaches have been proposed for extract...
We consider the problem of compressibility of protein sequences. Based on an observed genome-scale long-range correlation in concatenated protein sequences from different organism...