Sciweavers

JCB
2006

Recognition and Classification of Histones Using Support Vector Machine

Histones are DNA-binding proteins found in the chromatin of all eukaryotic cells. They are highly conserved and can be grouped into five major classes: H1/H5, H2A, H2B, H3, and H4. Two copies each of H2A, H2B, H3, and H4 bind to about 160 base pairs of DNA, forming the core of the nucleosome (the repeating structure of chromatin), and H1/H5 binds to its DNA linker sequence. Overall, histones have a high arginine/lysine content that is optimal for interaction with DNA. This sequence bias can make histones difficult to classify with standard sequence-similarity approaches. Therefore, in this paper, we applied a support vector machine (SVM) to recognize and classify histones on the basis of their amino acid and dipeptide composition. In a five-fold cross-validation, the SVM-based method distinguished histones from nonhistones (nuclear proteins) with an accuracy of about 98%. Similarly, we obtained an overall accuracy of >95% in discriminating the five classes o...
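The abstract names amino acid and dipeptide composition as the SVM inputs but does not spell out the encoding, so the following is only a minimal sketch of one common way to compute such features. The function names, the 20-letter alphabet ordering, and the choice to skip non-standard residues are all assumptions made for illustration, not details taken from the paper.

```python
from itertools import product

# Assumption: the 20 standard residues in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order so vectors are comparable.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def _standard_residues(seq):
    """Uppercase the sequence and drop anything outside the 20 standard residues.

    Simplification (assumption): residues on either side of a dropped
    character are treated as adjacent when dipeptides are counted.
    """
    return "".join(c for c in seq.upper() if c in AMINO_ACIDS)


def amino_acid_composition(seq):
    """Return 20 fractions: occurrences of each residue / sequence length."""
    residues = _standard_residues(seq)
    total = len(residues)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [residues.count(a) / total for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return 400 fractions: occurrences of each adjacent pair / (length - 1)."""
    residues = _standard_residues(seq)
    total = len(residues) - 1  # number of overlapping adjacent pairs
    if total < 1:
        return [0.0] * len(DIPEPTIDES)
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(total):
        counts[residues[i:i + 2]] += 1
    return [counts[d] / total for d in DIPEPTIDES]


def composition_features(seq):
    """Concatenate both compositions into one 420-dimensional vector."""
    return amino_acid_composition(seq) + dipeptide_composition(seq)


if __name__ == "__main__":
    # Toy lysine/arginine-rich fragment, not a real histone sequence.
    features = composition_features("KKRA")
    print(len(features))
```

Vectors of this kind, one per protein, could then be passed to any SVM implementation and scored with five-fold cross-validation; the kernel and parameters used in the paper are not given in the abstract and are not assumed here.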
Manoj Bhasin, Ellis L. Reinherz, Pedro A. Reche
Added 13 Dec 2010
Updated 13 Dec 2010
Type Journal
Year 2006
Where JCB
Authors Manoj Bhasin, Ellis L. Reinherz, Pedro A. Reche