Background: Total sequence decomposition, using the web-based MASIA tool, identifies areas of conservation in aligned protein sequences. By structurally annotating these motifs, the sequence can be parsed into individual building blocks, molecular legos ("molegos"), that can eventually be related to function. Here, the approach is applied to the apurinic/apyrimidinic endonuclease (APE) DNA repair proteins, essential enzymes that have been highly conserved throughout evolution. The APEs, DNase-1 and inositol 5'-polyphosphate phosphatases (IPP) form a superfamily that catalyze metal ion based phosphorolysis, but recognize different substrates. Results: MASIA decomposition of APE yielded 12 sequence motifs, 10 of which are also structurally conserved within the family and are designated as molegos. The 12 motifs include all the residues known to be essential for DNA cleavage by APE. Five of these molegos are sequentially and structurally conserved in DNase-1 and the IPP fa...
Catherine H. Schein, Numan Özgün, Tadahi