The significant and meaningful fraction of all the potential information residing in the molecules and structures of living systems is unknown. Sets of random molecular sequences o...
David J. Galas, Matti Nykter, Gregory W. Carter, N...
Low-Complexity Regions (LCRs) of biological sequences are the main source of false positives in similarity searches for biological sequence databases. We consider the problem of ...
Background: Similarity of sequences is a key mathematical notion for Classification and Phylogenetic studies in Biology. It is currently primarily handled using alignments. Howeve...
Paolo Ferragina, Raffaele Giancarlo, Valentina Gre...
A new statistical model for DNA considers a sequence to be a mixture of regions with little structure and regions that are approximate repeats of other subsequences, i.e. instance...
Lloyd Allison, Linda Stern, Timothy Edgoose, Trevo...