BMCBI
2007

Comparative analysis of long DNA sequences by per element information content using different contexts

Background: Features of a DNA sequence can be found by compressing the sequence under a suitable model; good compression implies low information content. Good DNA compression models consider repetition, differences between repeats, and base distributions. From a linear DNA sequence, a compression model can produce a linear information sequence, one information value per base. Linear space complexity is important when exploring long DNA sequences of the order of millions of bases. Compressing a sequence in isolation will include information on self-repetition, whereas compressing a sequence Y in the context of another sequence X can find what new information X gives about Y. This paper presents a methodology for performing comparative analysis to find features exposed by such models. Results: We apply such a model to find features across chromosomes of Cyanidioschyzon merolae. We present a tool that provides useful linear transformations to investigate and save new sequences. Various examples illustrate the methodology, findi...
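
The idea of a per-element information sequence, computed in isolation or in the context of another sequence, can be illustrated with a minimal sketch. This is not the compression model used in the paper: a simple adaptive order-k Markov model with add-one smoothing is swapped in as a stand-in, and the names `information_sequence`, `context`, and `k` are hypothetical. It assumes sequences contain only the bases A, C, G and T.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"  # assumption: no ambiguity codes such as N


def information_sequence(y, context="", k=2):
    """Return the information content (in bits) of each base of y.

    Stand-in model (an assumption, not the paper's): an adaptive order-k
    Markov model with add-one smoothing. If `context` (sequence X) is
    given, the model's counts are primed on it before y is processed.
    """
    # counts[previous k bases][next base], every count starting at 1
    counts = defaultdict(lambda: dict.fromkeys(ALPHABET, 1))

    # Prime the model on the context sequence X, if any.
    for i in range(k, len(context)):
        counts[context[i - k:i]][context[i]] += 1

    info = []
    for i, base in enumerate(y):
        if i < k:
            # Too little history: charge the uniform cost of 2 bits.
            info.append(math.log2(len(ALPHABET)))
            continue
        c = counts[y[i - k:i]]
        p = c[base] / sum(c.values())
        info.append(-math.log2(p))  # low value = well predicted
        c[base] += 1                # adapt the model as y is read
    return info


# Toy sequences for illustration only (not data from the paper).
x = "ACGT" * 50
y = "ACGT" * 50
alone = information_sequence(y)
given_x = information_sequence(y, context=x)
# Per-base reduction in information when X is available as context.
gain = [a - b for a, b in zip(alone, given_x)]
```

Both outputs have one value per base of `y`, so memory use grows linearly with sequence length. Positions where `gain` is large are those that X helps to predict beyond what Y's own self-repetition already explains.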
Added 12 Dec 2010
Updated 12 Dec 2010
Type Journal
Year 2007
Where BMCBI
Authors Trevor I. Dix, David R. Powell, Lloyd Allison, Julie Bernal, Samira Jaeger, Linda Stern