We consider the problem of compressibility of protein sequences. Based on an observed genome-scale long-range correlation in concatenated protein sequences from different organism...
Background: It is a major challenge of computational biology to provide a comprehensive functional classification of all known proteins. Most existing methods seek recurrent patte...
This paper introduces a novel algorithm for biological sequence compression that makes use of both statistical properties and repetition within sequences. A panel of experts is ma...
Minh Duc Cao, Trevor I. Dix, Lloyd Allison, Chris ...
Abstract. The emerging field of synthetic biology moves beyond conventional genetic manipulation to construct novel life forms which do not originate in nature. We explore the pro...
Bei Wang, Dimitris Papamichail, Steffen Mueller, S...
A simple statistical block code in combination with the LZW-based compression utilities gzip and compress has been found to increase by a significant amount the level of compressi...