In this paper we present an approach that evolves the consensus sequence [25] for multiple sequence alignment (MSA) with a genetic algorithm (GA). We have developed an encoding scheme such that the number of generations needed to find the optimal solution is approximately the same regardless of the number of sequences; instead, it depends only on the length of the template and the similarity between sequences. The objective function uses the sum-of-pairs (SP) score as the fitness value. We conducted preliminary studies and compared our approach with the commonly used heuristic alignment program Clustal W. The results indicate that the GA scales well and performs well.
Conrad Shyu, James A. Foster
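The sum-of-pairs (SP) fitness mentioned in the abstract can be illustrated with a minimal sketch. The function names `pair_score` and `sum_of_pairs` are hypothetical, as are the match, mismatch, and gap weights; the abstract does not specify the scoring scheme used, so this shows only the general form of an SP score, not the authors' objective function.

```python
from itertools import combinations


def pair_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score one pair of aligned symbols (weights are illustrative assumptions)."""
    if a == "-" and b == "-":
        return 0  # two gaps facing each other carry no score here
    if a == "-" or b == "-":
        return gap  # a residue aligned against a gap
    return match if a == b else mismatch


def sum_of_pairs(alignment, **weights):
    """Sum pair scores over every column and every pair of rows."""
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all aligned sequences must have the same length")
    total = 0
    for column in zip(*alignment):  # walk the alignment column by column
        for a, b in combinations(column, 2):  # every unordered pair of rows
            total += pair_score(a, b, **weights)
    return total


# Toy alignment of three sequences, with '-' marking a gap.
alignment = ["ACG-T", "ACGAT", "A-GAT"]
print(sum_of_pairs(alignment))
```

In a GA setting, a candidate solution would be decoded into such an alignment and the returned total used as its fitness; how the decoding works depends on the encoding scheme, which is not reproduced here.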