: We explore whether protein-RNA interfaces differ from non-interfaces in terms of their structural features and whether structural features vary according to the type of the bound RNA (e.g., mRNA, siRNA, etc.), using a non-redundant dataset of 147 protein chains extracted from protein-RNA complexes in the Protein Data Bank. Furthermore, we use machine learning algorithms for training classifiers to predict protein-RNA interfaces using information derived from the sequence and structural features. We develop the Struct-NB classifier that takes into account structural information. We compare the performance of Na¨ıve Bayes and Gaussian Na¨ıve Bayes with that of Struct-NB classifiers on the 147 protein-RNA dataset using sequence and structural features respectively as input to the classifiers. The results of our experiments show that Struct-NB outperforms Na¨ıve Bayes and Gaussian Na¨ıve Bayes on the problem of predicting the protein-RNA binding interfaces in a protein sequen...
Fadi Towfic, Cornelia Caragea, David C. Gemperline