Monotony of surprise and large-scale quest for unusual words

The problem of characterizing and detecting recurrent sequence patterns such as substrings or motifs and related associations or rules is variously pursued in order to compress data, unveil structure, infer succinct descriptions, extract and classify features, etc. In molecular biology, exceptionally frequent or rare words in bio-sequences have been implicated in various facets of biological function and structure. The discovery of such patterns, particularly on a massive scale, poses interesting methodological and algorithmic problems and often exposes scenarios in which tables and synopses grow faster and bigger than the raw sequences they are meant to encapsulate. In previous studies, the ability to succinctly compute, store, and display unusual substrings has been linked to a subtle interplay between the combinatorics of the subwords of a word and local monotonicities of some scores used to measure the departure from expectation. In this paper, we carry out an extensive analysis of s...

Alberto Apostolico, Mary Ellen Bock, Stefano Lonardi

Computational Biology | Infer Succinct Descriptions | RECOMB 2002 | Recurrent Sequence Patterns | Various Probabilistic Models |

Post Info

Added	03 Dec 2009
Updated	03 Dec 2009
Type	Conference
Year	2002
Where	RECOMB
Authors	Alberto Apostolico, Mary Ellen Bock, Stefano Lonardi
