Discrete profile comparison using information bottleneck


Sequence homologs are an important source of information about proteins. Amino acid profiles, representing the position-specific mutation probabilities found in homologs, are a richer encoding of biological sequences than the individual sequences themselves. However, profile comparisons are an order of magnitude slower than sequence comparisons, making profiles impractical for large datasets. Also, because they are such a rich representation, profiles are difficult to visualize. To address these problems, we describe a method to map probabilistic profiles to a discrete alphabet while preserving most of the information in the profiles. We find an informationally optimal discretization using the Information Bottleneck (IB) approach. We observe that an 80-character IB alphabet captures nearly 90% of the amino acid occurrence information found in profiles, compared to the consensus sequence's 78%. Distant homolog search with IB sequences is 88% as sensitive as with profiles compared ...
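The "fraction of information captured" figure in the abstract can be illustrated with a small sketch. This is not the authors' IB optimization; it only scores a given hard assignment of profile columns to alphabet letters by the ratio I(C;A) / I(X;A), where X indexes profile columns, A is the amino acid, and C is the assigned letter. The function names, the uniform weighting of columns, and the three-symbol toy profile are all assumptions made for illustration.

```python
import math

def mutual_information(rows, weights):
    """Return I(X;A) in bits, where rows[x][a] = p(a|x) and weights[x] = p(x)."""
    n_symbols = len(rows[0])
    # Marginal p(a) = sum_x p(x) * p(a|x)
    marginal = [sum(w * row[a] for row, w in zip(rows, weights))
                for a in range(n_symbols)]
    total = 0.0
    for row, w in zip(rows, weights):
        for a, p in enumerate(row):
            if p > 0 and w > 0:
                total += w * p * math.log2(p / marginal[a])
    return total

def merge_columns(rows, labels):
    """Average the columns that share a label (assumes uniform p(x)).

    Returns the merged distributions p(a|c) and their weights p(c).
    """
    n = len(rows)
    n_symbols = len(rows[0])
    sums, counts = {}, {}
    for row, label in zip(rows, labels):
        acc = sums.setdefault(label, [0.0] * n_symbols)
        for a, p in enumerate(row):
            acc[a] += p
        counts[label] = counts.get(label, 0) + 1
    merged = [[s / counts[label] for s in sums[label]] for label in sums]
    weights = [counts[label] / n for label in sums]
    return merged, weights

def retained_fraction(rows, labels):
    """Fraction of I(X;A) that survives replacing each column by its label."""
    uniform = [1.0 / len(rows)] * len(rows)
    full = mutual_information(rows, uniform)
    merged, weights = merge_columns(rows, labels)
    return mutual_information(merged, weights) / full

# Hypothetical toy profile: 4 columns over a 3-symbol alphabet.
toy_profile = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
]

# Merging the two similar columns loses only part of the information.
fraction = retained_fraction(toy_profile, [0, 0, 1, 2])
print(f"retained fraction: {fraction:.3f}")
```

Giving every column its own letter retains all of the information, and giving every column the same letter retains none. An IB-style alphabet would sit between those extremes, chosen to keep the ratio high for a fixed alphabet size.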

Sean O'Rourke, Gal Chechik, Robin Friedman, Eleazar Eskin


BMCBI 2006 | Profile | Sequence Comparison | Sequences |


» Using Hardware Performance Monitors to Isolate Memory Bottlenecks

» ProfileMe: Hardware Support for Instruction-Level Profiling on Out-of-Order Processors

» Discovering functional linkages and uncharacterized cellular pathways using phylogenetic p...

» A semiparametric modeling framework for potential biomarker discovery and the development ...

» Comparison of seven methods for producing Affymetrix expression scores based on False Disc...

» Linear predictive coding representation of correlated mutation for protein sequence alignm...

» Detection of distant evolutionary relationships between protein families using theory of s...

» Robust coordination to sustain throughput of an unstable agent network

Post Info

Added	10 Dec 2010
Updated	10 Dec 2010
Type	Journal
Year	2006
Where	BMCBI
Authors	Sean O'Rourke, Gal Chechik, Robin Friedman, Eleazar Eskin
