A Simple Statistical Algorithm for Biological Sequence Compression

Download www.csse.monash.edu.au

This paper introduces a novel algorithm for biological sequence compression that makes use of both statistical properties and repetition within sequences. A panel of experts is maintained to estimate the probability distribution of the next symbol in the sequence to be encoded. Expert probabilities are combined to obtain the final distribution. The resulting information sequence provides insight for further study of the biological sequence. Each symbol is then encoded by arithmetic coding. Experiments show that our algorithm outperforms existing compressors on typical DNA and protein sequence datasets while maintaining a practical running time.
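The expert-panel idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the order-k Markov experts, the add-one smoothing, the Bayesian weight update, and the names MarkovExpert and information_sequence are all assumptions chosen for illustration. The sketch blends the experts' predictions for each symbol and records the cost of that symbol in bits (the "information sequence"); an arithmetic coder, omitted here, would produce output close to the sum of those costs.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"  # assumed DNA alphabet


class MarkovExpert:
    """Hypothetical expert: order-k context model with add-one smoothing."""

    def __init__(self, order):
        self.order = order
        # Every context starts with a count of 1 per symbol.
        self.counts = defaultdict(lambda: dict.fromkeys(ALPHABET, 1))

    def _context(self, history):
        return history[-self.order:] if self.order else ""

    def predict(self, history):
        c = self.counts[self._context(history)]
        total = sum(c.values())
        return {s: c[s] / total for s in ALPHABET}

    def update(self, history, symbol):
        self.counts[self._context(history)][symbol] += 1


def information_sequence(seq, experts):
    """Return the cost in bits of each symbol under the blended prediction."""
    weights = [1.0 / len(experts)] * len(experts)
    bits = []
    for i, symbol in enumerate(seq):
        history = seq[:i]
        preds = [e.predict(history) for e in experts]
        # Blend: weighted average of the experts' probabilities for this symbol.
        p = sum(w * pr[symbol] for w, pr in zip(weights, preds))
        bits.append(-math.log2(p))
        # Bayesian reweighting: experts that predicted well gain weight.
        weights = [w * pr[symbol] / p for w, pr in zip(weights, preds)]
        for e in experts:
            e.update(history, symbol)
    return bits


seq = "ACGT" * 50
bits = information_sequence(seq, [MarkovExpert(0), MarkovExpert(2)])
print(f"{sum(bits) / len(seq):.3f} bits per symbol")
```

Peaks in the returned list mark positions the experts found hard to predict, which is the kind of per-symbol signal the abstract suggests can guide further study of a sequence.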

Minh Duc Cao, Trevor I. Dix, Lloyd Allison, Chris Mears

Biological Sequence Compression | Computer Graphics | DCC 2007 | Protein Sequence Datasets | Sequence Provides Insight |

» Local sequence alignments statistics deviations from Gumbel statistics in the rare-event ta...

» A Block Coding Method that Leads to Significantly Lower Entropy Values for the Proteins an...

» Compressed q-Gram Indexing for Highly Repetitive Biological Sequences

» Fast algorithms for computing sequence distances by exhaustive substring composition

» A compression algorithm for DNA sequences and its applications in genome comparison

» Effects of Long-Range Correlations in DNA on Sequence Alignment Score Statistics

» Simple Compression Code Supporting Random Access and Fast String Matching

» N-gram analysis of 970 microbial organisms reveals presence of biological language models

Post Info

Added	25 Dec 2009
Updated	25 Dec 2009
Type	Conference
Year	2007
Where	DCC
Authors	Minh Duc Cao, Trevor I. Dix, Lloyd Allison, Chris Mears

Sciweavers
