A TOPS diagram is a simplified description of the topology of a protein: a graph whose nodes are α-helices and β-strands, and whose edges correspond to chirality relations and to parallel or anti-parallel bonds between strands. We present an algorithm for matching two TOPS diagrams in which the likelihood of a match is measured according to previously known matches between complete 3D structures. This new form of training on 3D structures is recorded in transition matrices that estimate, from counts, the likelihood that a given TOPS feature, or combination of features, is replaced by another feature in homologs. The new algorithm outperforms existing ones on a benchmark database. Some biologically significant examples are also discussed. The method can be used whenever frequencies of edge relationship matches are known, as is the case for several biopolymer structures.
J. Rocha
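
The idea of a transition matrix over edge relationships can be illustrated with a minimal sketch. It is not the algorithm of the paper: the class `TopsDiagram`, the edge labels (`"A"` anti-parallel, `"P"` parallel, `"R"`/`"L"` chirality, `"-"` no edge), the functions `count_transitions`, `log_odds`, and `match_score`, and the log-odds scoring with a pseudocount are all assumptions made for illustration. The sketch counts how often an edge label in one diagram is aligned with each label in a homolog, and turns those counts into scores for a candidate node mapping.

```python
import math
from collections import Counter
from dataclasses import dataclass, field

NO_EDGE = "-"  # hypothetical label for "no relation between these two nodes"


@dataclass
class TopsDiagram:
    """Hypothetical minimal TOPS diagram: nodes are 'H' (helix) or 'E' (strand)."""
    nodes: str                                  # e.g. "EEH"
    edges: dict = field(default_factory=dict)   # (i, j) with i < j -> label


def edge_label(d, i, j):
    """Label of the edge between nodes i and j, or NO_EDGE if absent."""
    return d.edges.get((min(i, j), max(i, j)), NO_EDGE)


def count_transitions(aligned_pairs):
    """Count label replacements over known alignments.

    aligned_pairs: iterable of (diagram_a, diagram_b, mapping), where mapping
    sends node indices of diagram_a to node indices of diagram_b.
    """
    counts = Counter()
    for a, b, mapping in aligned_pairs:
        idx = sorted(mapping)
        for x in range(len(idx)):
            for y in range(x + 1, len(idx)):
                i, j = idx[x], idx[y]
                la = edge_label(a, i, j)
                lb = edge_label(b, mapping[i], mapping[j])
                counts[(la, lb)] += 1
    return counts


def log_odds(counts, pseudo=1.0):
    """Turn counts into log-odds scores (assumed scoring, with a pseudocount)."""
    labels = sorted({label for pair in counts for label in pair})
    smoothed = {(a, b): counts[(a, b)] + pseudo for a in labels for b in labels}
    total = sum(smoothed.values())
    row = {a: sum(smoothed[(a, b)] for b in labels) for a in labels}
    col = {b: sum(smoothed[(a, b)] for a in labels) for b in labels}
    return {
        (a, b): math.log(smoothed[(a, b)] * total / (row[a] * col[b]))
        for a in labels
        for b in labels
    }


def match_score(a, b, mapping, scores):
    """Sum the edge-replacement scores induced by a candidate node mapping."""
    idx = sorted(mapping)
    total = 0.0
    for x in range(len(idx)):
        for y in range(x + 1, len(idx)):
            i, j = idx[x], idx[y]
            pair = (edge_label(a, i, j), edge_label(b, mapping[i], mapping[j]))
            total += scores.get(pair, 0.0)
    return total


# Toy training data: one known alignment between two three-element diagrams.
a = TopsDiagram("EEH", {(0, 1): "A", (1, 2): "R"})
b = TopsDiagram("EEH", {(0, 1): "A"})
mapping = {0: 0, 1: 1, 2: 2}
counts = count_transitions([(a, b, mapping)])
scores = log_odds(counts)
```

With real data, the counts would come from many alignments of complete 3D structures rather than a single toy pair, and a search over candidate mappings would use `match_score` as its objective.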