Background: Percentage identity (PID) is frequently quoted in discussions of sequence alignments because it appears simple and easy to understand. However, there are several different ways to calculate PID, each of which may yield a different result for the same alignment, and the method of calculation is rarely reported. Quantifying the variation in PID caused by the different calculations would therefore help in interpreting PID values in the literature. In this study, the variation in PID was quantified systematically on a reference set of 1028 alignments generated by comparing protein three-dimensional structures. Because the alignment algorithm may also affect the range of PID, the study also considered the effect of the algorithm, and of the combination of algorithm and PID method.
G. P. S. Raghava, Geoffrey J. Barton
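
To illustrate how the choice of calculation alone can change the reported value, the following minimal sketch computes PID for one pairwise alignment using four different denominators. The text above does not specify which definitions were compared, so the four variants here (aligned positions plus internal gaps, aligned residue pairs only, length of the shorter sequence, mean sequence length) are assumptions chosen as commonly seen conventions. The function name `pid_variants`, the dictionary keys, and the example sequences are hypothetical.

```python
def pid_variants(a: str, b: str, gap: str = "-") -> dict:
    """Return PID for one gapped pairwise alignment under four denominators.

    `a` and `b` are the two aligned rows and must have equal length.
    All four definitions are illustrative assumptions, not a standard API.
    """
    if len(a) != len(b):
        raise ValueError("aligned rows must have the same length")

    # Ignore columns in which both rows contain a gap.
    cols = [(x, y) for x, y in zip(a, b) if not (x == gap and y == gap)]

    # Columns where both rows hold a residue (aligned residue pairs).
    paired = [i for i, (x, y) in enumerate(cols) if x != gap and y != gap]
    if not paired:
        raise ValueError("alignment contains no aligned residue pairs")

    identical = sum(1 for i in paired if cols[i][0] == cols[i][1])
    aligned = len(paired)
    len_a = sum(1 for x, _ in cols if x != gap)  # ungapped length of a
    len_b = sum(1 for _, y in cols if y != gap)  # ungapped length of b

    # Internal gaps: gap columns lying between the first and last aligned pair.
    internal_gaps = (paired[-1] - paired[0] + 1) - aligned

    return {
        "aligned_plus_internal_gaps": 100 * identical / (aligned + internal_gaps),
        "aligned_pairs": 100 * identical / aligned,
        "shorter_sequence": 100 * identical / min(len_a, len_b),
        "mean_length": 100 * identical / ((len_a + len_b) / 2),
    }


if __name__ == "__main__":
    # Hypothetical alignment: 7 identities, 8 aligned pairs, 3 internal gaps.
    row_a = "MKTAYIA-KQR--"
    row_b = "MKT--IAGKNRLE"
    for name, value in pid_variants(row_a, row_b).items():
        print(f"{name:28s} {value:6.2f}")
```

For this single hypothetical alignment the four variants give roughly 63.6, 87.5, 70.0 and 66.7 percent, a spread of more than 20 percentage points arising purely from the choice of denominator.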