The accurate prediction of enzyme catalytic sites remains an open problem in bioinformatics. Recently, several structure-based methods have become popular; however, few robust sequence-only methods have been developed. In this report, we demonstrate that three different feed-forward neural networks, trained on a variety of sequence-based properties, can reliably predict enzyme catalytic sites. To the best of our knowledge, this is only the second report to use neural networks to predict catalytic sites, and the first to rely solely on sequence-derived information. The models are trained with the scaled conjugate gradient algorithm. The simplest of the models uses only sequence conservation, diversity of position, and residue identity as input. Surprisingly, model accuracy is largely unaffected when sequence-based predictions of structural properties (i.e., solvent accessibility and secondary structure) are added to the input. A similar lack of improvement is observed when evolutio...
Swati Pande, Amar Raheja, Dennis R. Livesay
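
As an illustration of the simplest model's input described in the abstract (sequence conservation, diversity of position, and residue identity), the following minimal sketch encodes one residue and passes it through a small feed-forward network. The specific definitions are assumptions made for illustration and are not taken from the report: conservation is taken as the frequency of the most common residue in an alignment column, diversity as the Shannon entropy of that column, and the network has a single hidden layer of arbitrary size with random, untrained weights. The function names `column_features`, `encode_position`, and `forward` are hypothetical, and scaled conjugate gradient training is omitted.

```python
import math
from collections import Counter

import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_features(column):
    """Return (conservation, diversity) for one alignment column.

    Assumed definitions (not from the report):
      conservation = frequency of the most common residue
      diversity    = Shannon entropy of residue frequencies, in bits
    Gaps and non-standard symbols are ignored.
    """
    residues = [c for c in column if c in AMINO_ACIDS]
    if not residues:
        return 0.0, 0.0
    total = len(residues)
    freqs = [n / total for n in Counter(residues).values()]
    conservation = max(freqs)
    diversity = -sum(p * math.log2(p) for p in freqs)
    return conservation, diversity


def encode_position(residue, column):
    """Build a 22-element input: one-hot residue identity + two column scores."""
    one_hot = np.zeros(len(AMINO_ACIDS))
    idx = AMINO_ACIDS.find(residue)
    if idx >= 0:
        one_hot[idx] = 1.0
    conservation, diversity = column_features(column)
    return np.concatenate([one_hot, [conservation, diversity]])


def forward(x, w1, b1, w2, b2):
    """Single-hidden-layer feed-forward pass with a sigmoid output in (0, 1)."""
    hidden = np.tanh(w1 @ x + b1)
    logit = w2 @ hidden + b2
    return 1.0 / (1.0 + np.exp(-logit))


# Untrained, randomly initialised weights; the hidden-layer size is arbitrary.
rng = np.random.default_rng(0)
n_in, n_hidden = len(AMINO_ACIDS) + 2, 8
w1 = rng.normal(scale=0.1, size=(n_hidden, n_in))
b1 = np.zeros(n_hidden)
w2 = rng.normal(scale=0.1, size=n_hidden)
b2 = 0.0

# Toy example: a histidine in a mostly conserved column with one gap.
x = encode_position("H", "HHHHQH-H")
score = forward(x, w1, b1, w2, b2)
```

With trained weights, `score` would be read as the predicted likelihood that the residue is catalytic; here it only shows the data flow from sequence-derived features to a network output.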