Comparative analysis of long DNA sequences by per element information content using different contexts

Background: Features of a DNA sequence can be found by compressing the sequence under a suitable model; good compression implies low information content. Good DNA compression models consider repetition, differences between repeats, and base distributions. From a linear DNA sequence, a compression model can produce a linear information sequence. Linear space complexity is important when exploring long DNA sequences of the order of millions of bases. Compressing a sequence in isolation captures information on self-repetition, whereas compressing a sequence Y in the context of another sequence X reveals what new information X gives about Y. This paper presents a methodology for performing comparative analysis to find features exposed by such models.

Results: We apply such a model to find features across chromosomes of Cyanidioschyzon merolae. We present a tool that provides useful linear transformations to investigate and save new sequences. Various examples illustrate the methodology, findi...
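The idea of a per-element information sequence, computed either in isolation or in the context of another sequence, can be illustrated with a small sketch. This is not the compression model used in the paper; it swaps in a simple adaptive order-k Markov model with add-one smoothing as a stand-in, and the function name `information_sequence`, the parameter `k`, and the toy sequences are all assumptions made for illustration.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"  # assumed alphabet; ambiguity codes are not handled


def information_sequence(y, k=2, context=None):
    """Return the information content, in bits, of each base of y.

    A hypothetical adaptive order-k Markov model stands in for a real
    DNA compression model. If `context` is given, the model is first
    primed with counts from that sequence, so the result approximates
    the cost of encoding y given the context sequence.
    """
    # counts[preceding k bases][next base] -> occurrences seen so far
    counts = defaultdict(lambda: defaultdict(int))

    if context is not None:
        for i in range(k, len(context)):
            counts[context[i - k:i]][context[i]] += 1

    info = []
    for i, base in enumerate(y):
        ctx = y[max(0, i - k):i]
        seen = counts[ctx]
        total = sum(seen.values())
        # Add-one (Laplace) smoothing keeps every probability non-zero.
        p = (seen[base] + 1) / (total + len(ALPHABET))
        info.append(-math.log2(p))
        # Adaptive update: the model learns from y as it goes,
        # which is how self-repetition within y gets cheaper.
        seen[base] += 1
    return info


if __name__ == "__main__":
    x = "ACGT" * 50          # toy context sequence X
    y = "ACGT" * 10          # toy target sequence Y
    alone = information_sequence(y, k=2)
    given_x = information_sequence(y, k=2, context=x)
    print("bits for Y alone:  ", round(sum(alone), 2))
    print("bits for Y given X:", round(sum(given_x), 2))
```

The output holds one value per base, so it is itself a linear sequence of the same length as the input. Subtracting the two sequences element by element gives a rough picture of where X is informative about Y under this toy model.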

Trevor I. Dix, David R. Powell, Lloyd Allison, Julie Bernal, Samira Jaeger, Linda Stern

BMCBI 2007 | Compression Model | DNA Sequences | Long DNA Sequences

» Fast splice site detection using information content and feature reduction

» Kismeth Analyzer of plant methylation states through bisulfite sequencing

» An analysis of the positional distribution of DNA motifs in promoter regions and its biolo...

» XHM A system for detection of potential cross hybridizations in DNA microarrays

» SeeGH A software tool for visualization of whole genome array comparative genomic hybridi...

» Probabilistic base calling of Solexa sequencing data

» CGAT a comparative genome analysis tool for visualizing alignments in the analysis of comp...

» Knowledge Management through Content Interpretation

Post Info

Added	12 Dec 2010
Updated	12 Dec 2010
Type	Journal
Year	2007
Where	BMCBI
Authors	Trevor I. Dix, David R. Powell, Lloyd Allison, Julie Bernal, Samira Jaeger, Linda Stern
