Sciweavers

PAA
2006

Efficient median based clustering and classification techniques for protein sequences

Abstract In this paper, an efficient K-medians clustering (unsupervised) algorithm for prototype selection and a Supervised K-medians (SKM) classification technique for protein sequences are presented. For sequence data sets, a median string/sequence can be used as the cluster/group representative. In the K-medians clustering technique, a desired number of clusters, K, each represented by a median string/sequence, is generated, and these median sequences are used as prototypes for classifying a new/test sequence. In the SKM classification technique, the median sequence in each group/class of labelled protein sequences is determined, and the set of median sequences is used as prototypes for classification. It is found that the K-medians clustering technique outperforms the leader based technique, and that the SKM classification technique performs better than the motifs based approach for the data sets used. We further use a simple technique to reduce time and space requirements during p...
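The prototype idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain Levenshtein edit distance as the dissimilarity measure (the paper may use an alignment-based score), takes the "median" to be the set median (the member of a group with the smallest total distance to the other members), and seeds the clustering naively with the first K distinct sequences. The function names and the toy sequences are hypothetical.

```python
from typing import Dict, List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def set_median(seqs: List[str]) -> str:
    """Return the member of seqs with the smallest summed distance to all members."""
    return min(seqs, key=lambda s: sum(edit_distance(s, t) for t in seqs))


def supervised_medians(labelled: Dict[str, List[str]]) -> Dict[str, str]:
    """SKM-style prototypes (assumed form): one set median per labelled class."""
    return {label: set_median(seqs) for label, seqs in labelled.items()}


def classify(query: str, prototypes: Dict[str, str]) -> str:
    """Assign the query to the class whose prototype is nearest."""
    return min(prototypes, key=lambda lab: edit_distance(query, prototypes[lab]))


def k_medians(seqs: List[str], k: int, max_iter: int = 20) -> List[str]:
    """Unsupervised variant: alternate assignment and median update.

    Seeding with the first k distinct sequences is an assumption made
    for brevity, not a choice taken from the paper.
    """
    medians = list(dict.fromkeys(seqs))[:k]
    for _ in range(max_iter):
        clusters: List[List[str]] = [[] for _ in medians]
        for s in seqs:
            nearest = min(range(len(medians)),
                          key=lambda i: edit_distance(s, medians[i]))
            clusters[nearest].append(s)
        # An empty cluster keeps its previous median.
        updated = [set_median(c) if c else m for c, m in zip(clusters, medians)]
        if updated == medians:
            break
        medians = updated
    return medians


# Toy usage with made-up sequences.
labelled = {
    "A": ["MKVLA", "MKVLG", "MKALA"],
    "B": ["GGSPT", "GGSPS", "GASPT"],
}
prototypes = supervised_medians(labelled)
predicted = classify("MKVLT", prototypes)
```

Because each class or cluster is reduced to a single prototype, classifying a new sequence costs one distance computation per prototype rather than one per training sequence, which is the kind of saving that prototype selection is meant to provide.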
P. A. Vijaya, M. Narasimha Murty, D. K. Subramanian
Added 14 Dec 2010
Updated 14 Dec 2010
Type Journal
Year 2006
Where PAA
Authors P. A. Vijaya, M. Narasimha Murty, D. K. Subramanian