

Association algorithm to mine the rules that govern enzyme definition and to classify protein sequences

Background: The number of sequences compiled in many genome projects is growing exponentially, but most of them have not been characterized experimentally. An automatic annotation scheme is urgently needed to narrow the gap between the number of new sequences produced and the availability of reliable functional annotation. This work proposes rules for automatically classifying fungal genes. The approach involves elucidating the enzyme classification rules hidden in the UniProt protein knowledgebase and then applying them for classification. The association algorithm Apriori is used to mine the relationship between the enzyme class and significant InterPro entries. The candidate rules are then evaluated for their classificatory capacity. Results: Five datasets were collected from Swiss-Prot to establish the annotation rules and were treated as the training sets. The TrEMBL entries were treated as the testing set. A correct enzyme classification rate of 70% was obtained for the p...
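The rule-mining idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a simplified, Apriori-style level-wise search in which support is measured on the set of InterPro entries and confidence on the enzyme class that most often accompanies that set. The function names, thresholds, and the toy records (placeholder identifiers such as `IPR_A` and class labels such as `EC1`) are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def mine_rules(records, min_support=0.3, min_confidence=0.8, max_size=2):
    """Mine rules of the form {InterPro entries} -> enzyme class.

    records: list of (set of InterPro IDs, enzyme class label) pairs.
    Returns a list of (entries_tuple, label, support, confidence).
    """
    n = len(records)
    rules = []
    frequent = set()  # frequent itemsets found at the previous level
    for size in range(1, max_size + 1):
        counts = Counter()
        for entries, _ in records:
            for combo in combinations(sorted(entries), size):
                # Apriori pruning: skip a candidate if any of its
                # (size - 1)-subsets was not frequent at the previous level.
                if size > 1 and any(
                    sub not in frequent
                    for sub in combinations(combo, size - 1)
                ):
                    continue
                counts[combo] += 1
        frequent = {c for c, k in counts.items() if k / n >= min_support}
        for combo in frequent:
            # Class distribution among records containing this itemset.
            class_counts = Counter(
                label for entries, label in records if set(combo) <= entries
            )
            label, hits = class_counts.most_common(1)[0]
            confidence = hits / counts[combo]
            if confidence >= min_confidence:
                rules.append((combo, label, counts[combo] / n, confidence))
    return rules


def classify(entries, rules):
    """Return the class of the best matching rule, or None if none match."""
    matching = [r for r in rules if set(r[0]) <= entries]
    if not matching:
        return None
    # Prefer higher confidence, then the more specific (longer) antecedent.
    return max(matching, key=lambda r: (r[3], len(r[0])))[1]


# Hypothetical training records standing in for annotated Swiss-Prot entries.
TRAINING = [
    ({"IPR_A", "IPR_B"}, "EC1"),
    ({"IPR_A"}, "EC1"),
    ({"IPR_A", "IPR_C"}, "EC1"),
    ({"IPR_D"}, "EC3"),
    ({"IPR_D", "IPR_C"}, "EC3"),
    ({"IPR_D"}, "EC3"),
]

RULES = mine_rules(TRAINING)
# A hypothetical unannotated entry, standing in for a TrEMBL sequence.
PREDICTED = classify({"IPR_A", "IPR_C"}, RULES)
```

In this toy setting, an entry that appears with both classes (here `IPR_C`) fails the confidence threshold and yields no rule, which mirrors the abstract's step of evaluating candidate rules for their classificatory capacity before using them.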
Added 10 Dec 2010
Updated 10 Dec 2010
Type Journal
Year 2006
Authors Shih-Hau Chiu, Chien-Chi Chen, Gwo-Fang Yuan, Thy-Hou Lin