This paper introduces a novel algorithm for biological sequence compression that makes use of both statistical properties and repetition within sequences. A panel of experts is ma...
Minh Duc Cao, Trevor I. Dix, Lloyd Allison, Chris ...
It is well known that the base composition along eukaryotic genomes is long-range correlated. Here, we investigate the effect of such long-range correlations on alignment score sta...
Philipp W. Messer, Ralf Bundschuh, Martin Vingron,...
Background: The optimal score for ungapped local alignments of infinitely long random sequences is known to follow a Gumbel extreme value distribution. Less is known about the imp...
Stefan Wolfsheimer, Bernd Burghardt, Alexander K. ...
A simple statistical block code, combined with the LZW-based compression utilities gzip and compress, has been found to significantly increase the level of compressi...
The study of compressed storage schemes for highly repetitive sequence collections has been recently boosted by the availability of cheaper sequencing technologies and the flood of...