Fast protein homology and fold detection with sparse spatial sample kernels

Download www.cs.rutgers.edu

In this work we present a new string similarity feature, the sparse spatial sample (SSS). An SSS is a set of short substrings located at specific spatial displacements within the original string. Using this feature, we induce the SSS kernel (SSSK), which measures the agreement in SSS content between pairs of strings. On sequence classification tasks, the SSSK yields better prediction performance than existing algorithms at substantially reduced computational cost. We show that on the task of predicting the functional and structural classes of proteins, the SSSK achieves state-of-the-art performance across several benchmark sets in both supervised and semi-supervised learning settings. The results have immediate practical value for accurate protein superfamily and fold classification and may extend similarly to other sequence modeling domains.
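
The idea of counting short substrings at fixed displacements, and comparing two strings by how many such samples they share, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the simplest case of two substrings per sample, and the function names `sss_features` and `sss_kernel`, the substring length `k`, and the gap limit `max_gap` are hypothetical choices made for illustration only.

```python
from collections import Counter


def sss_features(seq, k=1, max_gap=3):
    """Count samples of the form (first k-mer, gap, second k-mer).

    Assumption for illustration: each sample holds exactly two k-mers,
    and the gap between them ranges over 0 .. max_gap - 1 characters.
    """
    counts = Counter()
    n = len(seq)
    for i in range(n - k + 1):
        first = seq[i:i + k]
        for gap in range(max_gap):
            j = i + k + gap          # start of the second k-mer
            if j + k > n:            # second k-mer would run off the end
                break
            counts[(first, gap, seq[j:j + k])] += 1
    return counts


def sss_kernel(x, y, k=1, max_gap=3):
    """Inner product of the two sample-count vectors."""
    fx = sss_features(x, k, max_gap)
    fy = sss_features(y, k, max_gap)
    if len(fx) > len(fy):            # iterate over the smaller feature map
        fx, fy = fy, fx
    return sum(c * fy[f] for f, c in fx.items())


if __name__ == "__main__":
    # Two toy strings; the value counts shared (k-mer, gap, k-mer) samples.
    print(sss_kernel("ABC", "ABD", k=1, max_gap=2))
```

Because only samples that actually occur in a string are stored, the feature map stays sparse, and the kernel value is symmetric in its two arguments.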

Pai-Hsi Huang, Pavel P. Kuksa, Vladimir Pavlovic

Computer Vision | ICPR 2008 | Specific Spatial Displacements | SSS Content | SSS Kernel |

Post Info
Added	05 Nov 2009
Updated	05 Nov 2009
Type	Conference
Year	2008
Where	ICPR
Authors	Pai-Hsi Huang, Pavel P. Kuksa, Vladimir Pavlovic
