Employing Compact Intra-genomic Language Models to Predict Genomic Sequences and Characterize Their Entropy



Probabilistic language models are fundamental to understanding and learning the profile of an underlying code and to estimating its entropy, which in turn enables the verification and prediction of "natural" emanations of the language. Language models capture salient statistical characteristics of the distribution of word sequences; transposed to the genomic language, they allow the peculiarities and regularities of genomic code to be modeled predictively under different inter- and intra-genomic conditions. In this paper, we propose applying compact intra-genomic language models to predict the composition of genomic sequences, with the aim of providing useful resources for data compression and broadening the scope of similarity analysis in genomic sequences. The results obtained encourage further investigation and validate the use of language models in biological sequence analysis. Keywords: language models, DNA entropy estimation, genomic sequen...
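As a rough illustration of the idea in the abstract, the sketch below trains a small order-k Markov model on a DNA string and uses it to estimate entropy in bits per base. It is not the authors' method: the fixed context length, the Laplace smoothing constant `alpha`, and all function names are assumptions chosen for brevity.

```python
import math
from collections import Counter, defaultdict

ALPHABET = "ACGT"  # assumption: plain nucleotide alphabet, no ambiguity codes


def train_kmer_model(sequence, order=3):
    """Count which base follows each context of `order` bases."""
    counts = defaultdict(Counter)
    for i in range(order, len(sequence)):
        context = sequence[i - order:i]
        counts[context][sequence[i]] += 1
    return counts


def next_base_probability(counts, context, base, alpha=1.0):
    """Laplace-smoothed estimate of P(base | context)."""
    seen = counts.get(context, Counter())
    total = sum(seen.values())
    return (seen[base] + alpha) / (total + alpha * len(ALPHABET))


def predict_next_base(counts, context, alpha=1.0):
    """Return the most probable next base for a context."""
    return max(ALPHABET, key=lambda b: next_base_probability(counts, context, b, alpha))


def cross_entropy_bits_per_base(counts, sequence, order=3, alpha=1.0):
    """Average of -log2 P(base | context) over all predictable positions."""
    log_loss = 0.0
    predicted = 0
    for i in range(order, len(sequence)):
        p = next_base_probability(counts, sequence[i - order:i], sequence[i], alpha)
        log_loss -= math.log2(p)
        predicted += 1
    return log_loss / predicted if predicted else float("nan")


if __name__ == "__main__":
    genome = "ACGT" * 50  # toy, highly repetitive sequence
    model = train_kmer_model(genome, order=3)
    print(predict_next_base(model, "ACG"))
    print(cross_entropy_bits_per_base(model, genome, order=3))
```

A model with no training data assigns 0.25 to every base, which gives the 2 bits per base upper bound for a four-letter alphabet; values below that on real sequence indicate regularities the model has captured, which is what makes such estimates relevant to compression.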

Sérgio A. D. Deusdado, Paulo Carvalho


Emerging Technology | Genomic Sequences | Intra-genomic Language Models | ISAMI 2010 | Language Models


Post Info

Added	13 Feb 2011
Updated	13 Feb 2011
Type	Journal
Year	2010
Where	ISAMI
Authors	Sérgio A. D. Deusdado, Paulo Carvalho
