Protein sequences classification by means of feature extraction with substitution matrices

Background: This paper deals with the preprocessing of protein sequences for supervised classification. Motif extraction is one way to address that task. It has been widely used to encode biological sequences into feature vectors, the format required by well-known machine-learning classifiers. However, designing a suitable feature space for a set of proteins is not a trivial task. For this purpose, we propose a novel encoding method that uses amino-acid substitution matrices to define similarity between motifs during the extraction step.

Results: To demonstrate the efficiency of such an approach, we compare several encoding methods using a number of machine-learning classifiers. The experimental results showed that our encoding method outperforms the others in terms of classification accuracy and number of generated attributes. We also compared the classifiers in terms of accuracy. Results indicated that SVM generally outperforms the other classifiers with any encod...
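
The general idea in the abstract, treating two motifs as the same feature when a substitution matrix scores them as similar, can be illustrated with a minimal sketch. This is not the authors' algorithm: the greedy grouping, the fixed motif length k, the threshold, and the toy score table (illustrative values, not a real BLOSUM or PAM matrix) are all assumptions made for the example, as are the function names.

```python
# Toy substitution scores: illustrative values only, NOT a real BLOSUM/PAM matrix.
TOY_SCORES = {
    frozenset("IL"): 2, frozenset("IV"): 3, frozenset("LV"): 1,
    frozenset("KR"): 2, frozenset("DE"): 2, frozenset("FY"): 3,
}
MATCH, MISMATCH = 4, -2  # assumed scores for identical / unlisted pairs


def sub_score(a, b):
    """Score for substituting amino acid a with b under the toy table."""
    if a == b:
        return MATCH
    return TOY_SCORES.get(frozenset((a, b)), MISMATCH)


def motif_score(m1, m2):
    """Sum of position-wise substitution scores of two equal-length motifs."""
    return sum(sub_score(a, b) for a, b in zip(m1, m2))


def extract_motifs(seq, k):
    """All overlapping substrings of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_features(sequences, k, threshold):
    """Greedy grouping (an assumption of this sketch): a motif becomes a new
    feature only if no existing representative scores >= threshold against it."""
    reps = []
    for seq in sequences:
        for m in extract_motifs(seq, k):
            if not any(motif_score(m, r) >= threshold for r in reps):
                reps.append(m)
    return reps


def encode(seq, reps, k, threshold):
    """Binary vector: 1 if the sequence contains a motif similar to the feature."""
    motifs = extract_motifs(seq, k)
    return [int(any(motif_score(m, r) >= threshold for m in motifs)) for r in reps]


sequences = ["KILDE", "RVLDD"]
reps = build_features(sequences, k=3, threshold=9)
vectors = [encode(s, reps, k=3, threshold=9) for s in sequences]
```

With these toy scores the two sequences share no exact 3-mer, yet they yield three features instead of six, because each motif of the second sequence scores at or above the threshold against a motif of the first. The resulting binary vectors could then be passed to any standard classifier.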

Rabie Saidi, Mondher Maddouri, Engelbert Mephu Nguifo

BMCBI 2010 | Protein Sequences | Substitution Matrices | Well-known Machine-learning Classifiers |

Post Info

Added	08 Dec 2010
Updated	08 Dec 2010
Type	Journal
Year	2010
Where	BMCBI
Authors	Rabie Saidi, Mondher Maddouri, Engelbert Mephu Nguifo
